chase chase - 1 month ago 12
Python Question

New column in pandas - adding series to dataframe by applying a list groupby

Best description is just to show what I am looking for:

Id 'other' 'concat'
A 'z' 1
A 'y' 2
B 'x' 3
B 'w' 4
B 'v' 5
B 'u' 6


needs to be

Id 'other' 'concat'
A 'z' [1,2]
A 'y' [1,2]
B 'x' [3,4,5,6]
B 'w' [3,4,5,6]
B 'v' [3,4,5,6]
B 'u' [3,4,5,6]


This is similar to these questions:

grouping rows in list in pandas groupby

Replicating GROUP_CONCAT for pandas.DataFrame

However, what I want is to apply the grouping you get from df.groupby('Id')['concat'].apply(list), which is a Series of smaller size than the dataframe, back to the original dataframe.

I have tried the code below, but it does not apply this to the dataframe:

import pandas as pd
df = pd.DataFrame( {'Id':['A','A','B','B','B','C'], 'other':['z','y','x','w','v','u'], 'concat':[1,2,5,5,4,6]})
df.groupby('Id')['concat'].apply(list)


I know that transform can be used to apply groupings to dataframes, but it does not work in this case: transform(list) returns the original values, and assigning the result of apply(list) gives NaN.

>>> df['new_col'] = df.groupby('Id')['concat'].transform(list)
>>> df
Id concat other new_col
0 A 1 z 1
1 A 2 y 2
2 B 5 x 5
3 B 5 w 5
4 B 4 v 4
5 C 6 u 6
>>> df['new_col'] = df.groupby('Id')['concat'].apply(list)
>>> df
Id concat other new_col
0 A 1 z NaN
1 A 2 y NaN
2 B 5 x NaN
3 B 5 w NaN
4 B 4 v NaN
5 C 6 u NaN
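
The NaN column in the second attempt is most likely an index alignment effect: the grouped Series is indexed by the Id values, while the dataframe is indexed by 0..5, so no labels match when the column is assigned. A minimal sketch of that check, using the sample frame from the question (the names grouped, shared_labels and aligned are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Id': ['A', 'A', 'B', 'B', 'B', 'C'],
    'other': ['z', 'y', 'x', 'w', 'v', 'u'],
    'concat': [1, 2, 5, 5, 4, 6],
})

# One list per group; the result is indexed by the Id values.
grouped = df.groupby('Id')['concat'].apply(list)

# The dataframe is indexed by 0..5, so the two indexes share no labels.
shared_labels = set(grouped.index) & set(df.index)

# Column assignment aligns on the index; reindexing to df.index mimics
# that alignment and leaves every row without a match.
aligned = grouped.reindex(df.index)

print(list(grouped.index))
print(shared_labels)
print(aligned.isna().all())
```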

Answer

Use groupby with join:

df.join(df.groupby('Id').concat.apply(list).to_frame('new'), on='Id')
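
For reference, here is the one-liner above spelled out as a runnable sketch on the sample frame from the question. The intermediate name lists_per_id is only illustrative. The last step shows Series.map as an alternative that is not part of the original answer; it should give the same column, assuming every Id appears in the grouped result.

```python
import pandas as pd

df = pd.DataFrame({
    'Id': ['A', 'A', 'B', 'B', 'B', 'C'],
    'other': ['z', 'y', 'x', 'w', 'v', 'u'],
    'concat': [1, 2, 5, 5, 4, 6],
})

# One list per Id, turned into a one-column frame named 'new'
# whose index is the Id values.
lists_per_id = df.groupby('Id')['concat'].apply(list).to_frame('new')

# join(on='Id') looks up each row's 'Id' value in the index of
# lists_per_id, so every row receives its group's list.
# join returns a new frame; df itself is left unchanged.
result = df.join(lists_per_id, on='Id')

# Alternative (not from the answer): map each Id to its list.
result_map = df.assign(new=df['Id'].map(lists_per_id['new']))

print(result)
```

Since join does not modify df in place, assign the result back (df = df.join(...)) if the new column should be kept.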
