Ион Сынкетру - 5 months ago
Python Question

String replacement with pandas

I have a pandas column with some string values like:

White bear
Brown Bear
Brown Bear 100 Kg
White bear 200 cm

How can I check whether each string contains the sequence 'White bear' and, if it does, replace the entire value (not only the sequence) with a string like 'White_bear'?

df['Species'] = df['Species'].str.replace('White bear', 'White_bear')

did not work for me, because it replaces only the matching sequence and leaves the rest of the value in place.


You can use boolean indexing:

In [173]: df.loc[df.Species.str.contains(r'\bWhite\s+bear\b'), 'Species'] = 'White_bear'

In [174]: df
0         White_bear
1         Brown Bear
2  Brown Bear 100 Kg
3         White_bear
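
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same boolean-indexing approach. The DataFrame is rebuilt from the four sample values in the question; the `mask` variable and the `na=False` argument are my additions (the latter is an assumption that missing values, if any, should simply be treated as non-matching).

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    'Species': ['White bear', 'Brown Bear', 'Brown Bear 100 Kg', 'White bear 200 cm']
})

# Boolean mask: True for rows whose value contains "White bear" as whole words.
# na=False makes missing values count as "no match" (an assumption, not in the original).
mask = df['Species'].str.contains(r'\bWhite\s+bear\b', na=False)

# Overwrite the whole cell, not just the matched part, in the selected rows.
df.loc[mask, 'Species'] = 'White_bear'

result = df['Species'].tolist()
print(result)
```

Rows that do not match the pattern are left untouched, which is why the two 'Brown Bear' values keep their original text.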

Or a slightly more general solution:

In [204]: df
0         White bear
1         Brown Bear
2  Brown Bear 100 Kg
3  White bear 200 cm

In [205]: from_re = [r'.*?\bwhite\b\s+\bbear\b.*',r'.*?\bbrown\b\s+\bbear\b.*']

In [206]: to_re = ['White_bear','Brown_bear']

In [207]: df.Species = df.Species.str.lower().replace(from_re, to_re, regex=True)

In [208]: df
0  White_bear
1  Brown_bear
2  Brown_bear
3  White_bear
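
One thing to be aware of with the snippet above: `.str.lower()` is applied to the whole column, so any value that matches none of the patterns would come back lowercased. A possible variant, sketched below, keeps the case-insensitive matching but uses the inline `(?i)` regex flag instead of lowercasing. The 'Giant Panda' row is a hypothetical extra value that is not in the question; it is there only to show that non-matching values keep their original case.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical non-matching row.
df = pd.DataFrame({
    'Species': ['White bear', 'Brown Bear', 'Brown Bear 100 Kg',
                'White bear 200 cm', 'Giant Panda']
})

# (?i) makes each pattern case-insensitive; the leading .*? and trailing .*
# make the pattern cover the whole value, so the whole value is replaced.
from_re = [r'(?i).*?\bwhite\s+bear\b.*', r'(?i).*?\bbrown\s+bear\b.*']
to_re = ['White_bear', 'Brown_bear']

df['Species'] = df['Species'].replace(from_re, to_re, regex=True)

result = df['Species'].tolist()
print(result)
```

Because the column itself is never lowercased, only the values that match one of the patterns change.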