matlabit - 1 month ago
Python Question

Apply "list" function on multiple columns pandas

To collect the values from several rows into one list per group with groupby in pandas, I can do this:

import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2, 2, 2, 3], 'B': ['a', 'b', 'c', 'd', 'e', 'f', 'g']})

df = df.groupby('A')['B'].apply(list)


I will get:

A
-------------------
1 [a, b]
2 [c, d, e, f]
3 [g]


I want to do the same with agg:

f = {"B":[list]}
df = df.groupby('A').agg(f)


but that raises an error.
Any ideas?

Thanks,

Answer

You can use tolist; the output is a Series:

df = df.groupby('A')['B'].agg(lambda x: x.tolist())
print (df)
A
1          [a, b]
2    [c, d, e, f]
3             [g]
dtype: object
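
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the frame from the question and stores the result under a separate name (`lists` is just an illustrative name, not from the original post) so that `df` stays available for further grouping:

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({'A': [1, 1, 2, 2, 2, 2, 3],
                   'B': ['a', 'b', 'c', 'd', 'e', 'f', 'g']})

# Collect the B values of each group into a plain Python list.
# The result is a Series indexed by the group key A.
lists = df.groupby('A')['B'].agg(lambda x: x.tolist())

print(lists.loc[2])
```

Keeping the result in its own variable avoids overwriting `df` with the aggregated Series, which would make a later `df.groupby('A')` call fail.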

Or specify column B in a dict; the output is a DataFrame:

df = df.groupby('A').agg({'B': lambda x: x.tolist()})
print (df)
              B
A              
1        [a, b]
2  [c, d, e, f]
3           [g]

Using list(x) instead of x.tolist() also works:

df = df.groupby('A')['B'].agg(lambda x: list(x))
print (df)
A
1          [a, b]
2    [c, d, e, f]
3             [g]
dtype: object

df = df.groupby('A').agg({'B': lambda x: list(x)})
print (df)
              B
A              
1        [a, b]
2  [c, d, e, f]
3           [g]
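
Since the title asks about multiple columns, the dict form should extend naturally to one entry per column. The sketch below assumes an extra column `C`, which is hypothetical and not part of the question's data, and uses a small helper `to_list` (also just an illustrative name):

```python
import pandas as pd

# Data from the question plus a hypothetical second value column C.
df = pd.DataFrame({'A': [1, 1, 2, 2, 2, 2, 3],
                   'B': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
                   'C': [10, 20, 30, 40, 50, 60, 70]})


def to_list(x):
    """Turn the values of one group into a plain Python list."""
    return x.tolist()


# One dict entry per column; the result is a DataFrame with columns B and C.
out = df.groupby('A').agg({'B': to_list, 'C': to_list})

print(out)
```

Each cell of `out` holds the list of values for that group and column, so `out.loc[1, 'C']` would be the list of C values where A equals 1.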