L. Joe - 6 months ago
Python Question

Pandas replace type issue

I have a pandas DataFrame with a column that contains data such as:

1 year
1 month
1 week
4 year
3 week


etc etc

I am trying to replace anything that contains "month" or "week" with 0. I have tried:

train_df.age["weeks" in train_df.age] = 0


and

for i in train_df['age']:
    if "weeks" in i:
        i = "0"


Neither of these seems to work.

Any advice on how to do this?
Thanks.

Answer

Use str.contains:

train_df.loc[train_df['age'].str.contains(r'week|month'), 'age'] = 0

Here we pass a regex pattern that checks whether each value contains either 'week' or 'month', and use the resulting boolean mask to update just the rows of interest:

In [4]:
df.loc[df['age'].str.contains(r'week|month'), 'age'] = 0
df

Out[4]:
    age
1  year
1     0
1     0
4  year
3     0