user49007 - 2 months ago
Python Question

str.replace starting from the back in pandas DataFrame

I have two columns like so:

                                     string         s
0  the best new york cheesecake new york ny  new york
1             houston public school houston   houston


I want to remove the last occurrence of s in string. For context, my DataFrame has hundreds of thousands of rows. I know about str.replace and str.rfind, but nothing that does the desired combination of both, and I'm coming up blank in improvising a solution.

Thanks in advance for any help!

Answer Source

You can use rsplit and join:

df.apply(lambda x: ''.join(x['string'].rsplit(x['s'],1)),axis=1)

Output:

0    the best new york cheesecake  ny
1              houston public school 
dtype: object
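
To see why this works, it may help to look at a single row in plain Python. rsplit with a maxsplit of 1 cuts the string only at the rightmost occurrence of the separator, and joining the pieces with an empty string drops that occurrence. A minimal sketch (the helper name remove_last is made up for illustration and is not part of the answer):

```python
def remove_last(text, sub):
    """Return text with the rightmost occurrence of sub removed."""
    # rsplit scans from the right; maxsplit=1 yields at most two pieces
    return ''.join(text.rsplit(sub, 1))

# repr() makes the leftover spaces visible
print(repr(remove_last('the best new york cheesecake new york ny', 'new york')))
print(repr(remove_last('houston public school houston', 'houston')))
# If sub does not occur, the string comes back unchanged
print(repr(remove_last('no match here', 'xyz')))
```

Note that the spaces around the removed text stay behind, and that an empty sub raises ValueError because rsplit rejects an empty separator.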

Edit: to write the result back to the column and collapse the double spaces left behind:

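# regex=True is needed on pandas 2.0+, where str.replace treats the pattern literally by default
df['string'] = df.apply(lambda x: ''.join(x['string'].rsplit(x['s'], 1)), axis=1).str.replace(r'\s\s', ' ', regex=True)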

print(df)

Output:

                            string         s  third
0  the best new york cheesecake ny  new york      1
1           houston public school    houston      1
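
Since the question mentions hundreds of thousands of rows: apply with axis=1 calls the lambda once per row, which is often slower than looping over the two columns directly, so it may be worth timing an alternative on your own data. The sketch below is not from the answer above. It assumes both columns contain only strings (no missing values), and the function name remove_last_each is hypothetical. It also tidies whitespace with split and join, which trims leading and trailing spaces in addition to collapsing doubled ones.

```python
def remove_last_each(strings, subs):
    """For each (text, sub) pair, drop the rightmost sub from text and tidy whitespace."""
    cleaned = []
    for text, sub in zip(strings, subs):
        # rsplit with maxsplit=1 cuts only at the last occurrence of sub
        without_last = ''.join(text.rsplit(sub, 1))
        # split() with no arguments splits on runs of whitespace, so
        # re-joining collapses doubled spaces and trims both ends
        cleaned.append(' '.join(without_last.split()))
    return cleaned

# Plain lists standing in for the two DataFrame columns
strings = ['the best new york cheesecake new york ny', 'houston public school houston']
subs = ['new york', 'houston']
print(remove_last_each(strings, subs))
```

With the DataFrame from the question, the same function could be applied as df['string'] = remove_last_each(df['string'], df['s']), since iterating over a Series yields its values.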