billboard - 1 month ago
Python Question

How do I apply a regex substitution in a string column

I have a pandas DataFrame with a column like the one below:

Years in current job
< 1 year
10+ years
9 years
1 year


I want to use a regex, or any other technique in Python, to get this result:

Years in current job
1
10
9
1


I came up with something like this, but I guess it can be done in a better way using a regex:

frame["Years in current job"] = frame["Years in current job"].str.replace(" ","")
frame["Years in current job"] = frame["Years in current job"].str.replace("<","")
frame["Years in current job"] = frame["Years in current job"].str.replace("+","")
# remove "years" before "year", otherwise a stray "s" is left behind
frame["Years in current job"] = frame["Years in current job"].str.replace("years","")
frame["Years in current job"] = frame["Years in current job"].str.replace("year","")

Answer
df['Years in current job'] = df['Years in current job'].str.replace(r'\D+', '', regex=True).astype('int')

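The regex \D+ matches one or more consecutive non-digit characters. Replacing every match with an empty string leaves only the digits, which astype('int') then converts to integers.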
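For anyone who wants to try this out, here is a minimal sketch that runs the substitution on the four sample values from the question. The frame is built by hand purely for illustration, and regex=True is passed explicitly on the assumption that your pandas version may treat the pattern as a literal string by default.

```python
import pandas as pd

# Sample data mirroring the column shown in the question (built by hand for illustration)
df = pd.DataFrame(
    {"Years in current job": ["< 1 year", "10+ years", "9 years", "1 year"]}
)

# Remove every run of non-digit characters, then convert the remaining digits to integers.
# regex=True is explicit so the pattern is not interpreted as a literal string.
df["Years in current job"] = (
    df["Years in current job"]
    .str.replace(r"\D+", "", regex=True)
    .astype(int)
)

print(df["Years in current job"].tolist())  # [1, 10, 9, 1]
```

Note that "< 1 year" becomes 1, which matches the result asked for in the question. If the column can contain missing values or entries without any digits, the astype step will likely fail and those rows would need handling first.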


I found this on SO: http://stackoverflow.com/a/22591024/1832058
