GBR24 - 1 year ago
Python Question

Method to sort the words within each row of a pandas Series?

Consider the following:


import pandas as pd

s = pd.Series(["hello there you would like to sort me", "sorted i would like to be", "the yankees played the red sox", "apple apple banana fruit orange cucumber"])

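I would like to sort the words inside each row, similar to the following approach:

for row in s.index:
    # split the row into words and sort them alphabetically
    split_words = sorted(s.loc[row].split())
    # write the sorted words back as a single string
    s.loc[row] = " ".join(split_words)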

I have a huge dataset, however, so vectorization is important here. How can I use pandas
methods to accomplish the same thing, but much faster?

Answer Source

Use the string accessor str to split each row into a list of words. Then apply sorted to each list and rejoin the words with str.join.

s.str.split().apply(sorted).str.join(' ')

0       hello like me sort there to would you
1                   be i like sorted to would
2              played red sox the the yankees
3    apple apple banana cucumber fruit orange
dtype: object
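
As a rough check, the sketch below rebuilds the Series from the question and compares the accessor chain against a plain per-row function. The helper name sort_words is hypothetical and only used for illustration. Both approaches still call Python's sorted once per row, so neither is vectorized in the NumPy sense, and any speed difference on a large dataset should be measured rather than assumed.

```python
import pandas as pd

s = pd.Series([
    "hello there you would like to sort me",
    "sorted i would like to be",
    "the yankees played the red sox",
    "apple apple banana fruit orange cucumber",
])

# Accessor chain from the answer: split, sort each list, rejoin.
via_accessor = s.str.split().apply(sorted).str.join(" ")


def sort_words(text):
    """Hypothetical helper: sort the whitespace-separated words of one string."""
    return " ".join(sorted(text.split()))


# Alternative: apply the helper to each row with Series.map.
via_map = s.map(sort_words)

# Both approaches should produce the same strings, row for row.
print(via_accessor.tolist() == via_map.tolist())
```

The helper assumes every row is a string; rows with missing values would need separate handling.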