Steven G - 2 months ago 5
Python Question

Manipulating a list of strings containing digits to output a list of numbers

I am looking for help manipulating a list of strings from which I want to extract the numbers, such as:

x = ['aa bb qq 2 months 60%', 'aa bb qq 3 months 70%', 'aa bb qq 1 month 80%']


I am trying to get:

[[2.0,60.0],[3.0,70.0],[1.0,80.0]]


in an elegant fashion.

The first number should always be an integer, but the second number can be a float with a decimal part.

My dirty workaround is this:

x_split = [y.replace("%", "").split() for y in x]
x_float = [[float(s) for s in x if s.isdigit()] for x in x_split]

Out[100]: [[2.0, 60.0], [3.0, 70.0], [1.0, 80.0]]

Answer

Use a regular expression to match integers and floats.

import re
[[float(n) for n in re.findall(r'\d+\.?\d*', s)] for s in x]

Explanation for the regex (r'\d+\.?\d*'):

r    #  a raw string, so that backslashes are not treated as escape characters
\d   #  digit 0 to 9
+    #  one or more of the previous pattern (\d)
\.   #  a decimal point
?    #  zero or one of the previous pattern (\.)
\d   #  digit 0 to 9
*    #  zero or more of the previous pattern (\d)
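As a minimal sketch of how this pattern behaves when the percentage has a decimal part (a case the question mentions but the sample data does not show), the snippet below wraps the same regex in a small helper. The name `extract_numbers` and the `70.5%` value are made up for illustration and are not part of the original question.

```python
import re

# Same pattern as above: one or more digits, an optional decimal point,
# then zero or more digits.
NUMBER = re.compile(r'\d+\.?\d*')

def extract_numbers(strings):
    """Return one list of floats per input string (hypothetical helper)."""
    return [[float(n) for n in NUMBER.findall(s)] for s in strings]

# Sample data from the question, with one percentage changed to a decimal.
x = ['aa bb qq 2 months 60%', 'aa bb qq 3 months 70.5%', 'aa bb qq 1 month 80%']
result = extract_numbers(x)
print(result)  # [[2.0, 60.0], [3.0, 70.5], [1.0, 80.0]]

# str.isdigit() rejects strings that contain a decimal point,
# so a filter based on it would skip the token '70.5'.
print('70.5'.isdigit())  # False
```

Note that the pattern also matches digits embedded in words (for example the `1` in `a1`) and does not handle a leading minus sign, so it fits best when the input looks like the sample strings.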