brian_the_bungler - 11 days ago
Python Question

python pandas custom agg function

Dataframe:
   one two
a    1   x
b    1   y
c    2   y
d    2   z
e    3   z

grp = df.groupby('one')
grp.agg(lambda x: ???)  # or equivalent function


Desired output from grp.agg:

one  two
1    x|y
2    y|z
3    z


My agg function before integrating dataframes was "|".join(sorted(set(x))). Ideally I want to be able to have any number of columns in the group, with agg returning "|".join(sorted(set(x))) for each column, as for column two above. I also tried np.char.join().

I love pandas; it has taken me from an 800-line complicated program to a 400-line walk in the park that zooms. Thank you :)

Answer

You were so close:

In [1]: df.groupby('one').agg(lambda x: "|".join(x.tolist()))
Out[1]:
     two
one
1    x|y
2    y|z
3      z

Expanded answer that also sorts the values and keeps only the unique ones in each group:

In [1]: df = DataFrame({'one':[1,1,2,2,3], 'two':list('xyyzz'), 'three':list('eecba')}, index=list('abcde'), columns=['one','two','three'])

In [2]: df
Out[2]:
   one two three
a    1   x     e
b    1   y     e
c    2   y     c
d    2   z     b
e    3   z     a

In [3]: df.groupby('one').agg(lambda x: "|".join(x.sort_values().unique().tolist()))
Out[3]:
     two three
one
1    x|y     e
2    y|z   b|c
3      z     a
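
For reference, the same idea can be written as a standalone script that reuses the asker's original "|".join(sorted(set(x))) expression directly. This is only a sketch: the helper name join_unique is made up for illustration, and the str() conversion is an added assumption so that non-string columns would not break the join.

```python
import pandas as pd

# Sample frame from the answer above.
df = pd.DataFrame(
    {"one": [1, 1, 2, 2, 3], "two": list("xyyzz"), "three": list("eecba")},
    index=list("abcde"),
)


def join_unique(values):
    """Join the sorted, de-duplicated values of one group with '|'.

    join_unique is a hypothetical helper name, not part of pandas.
    str() is applied so the join also works for non-string values.
    """
    return "|".join(str(v) for v in sorted(set(values)))


# agg calls join_unique once per group for every non-grouping column.
result = df.groupby("one").agg(join_unique)
print(result)
```

On the sample frame this gives x|y, y|z and z in column two, and e, b|c and a in column three. Because agg applies the function to each column separately, adding more columns to df needs no change to the helper.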