Miyashita Hikaru - 3 months ago
Python Question

Grouping and summing a MultiIndex DataFrame in pandas

I have a dataframe like this.

import numpy as np
import pandas as pd

df1 = pd.DataFrame({"A": np.random.randint(1, 10, 4), "B": np.random.randint(1, 10, 4), "C": list('abba')})
df1.index.name = "first"
df2 = pd.DataFrame({"A": np.random.randint(1, 10, 5), "B": np.random.randint(1, 10, 5), "C": list('aaabb')})
df2.index.name = "second"
# keys adds an outer index level labelled 'first' / 'second'
df = pd.concat([df1, df2], keys=['first', 'second'])
df
          A  B  C
first  0  6  5  a
       1  2  2  b
       2  1  6  b
       3  6  9  a
second 0  6  6  a
       1  9  9  a
       2  8  4  a
       3  7  2  b
       4  9  8  b


I would like to group by column "C" within each outer index key and sum "A" and "B", to get a result like this:

  first     second
      A   B      A   B
a    12  14     23  19
b     3   8     16  10


How can I get this result?

Answer

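You can pass groupby a list of array-like objects, each the same length as the frame. Here you want to group by the first level of the index together with column 'C', sum, and then reshape so that the index level ends up in the columns.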

df.groupby([df.index.get_level_values(0), df.C]).sum() \
    .unstack().stack(0).T.rename_axis(None)
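
To see what each step contributes, the grouping and the reshaping can be written out separately. This is only a sketch: the fixed numbers are copied from the frame printed in the question (the original data was random), and the swaplevel/sort_index route is an alternative way to arrange the columns, not the chain used in the answer above.

```python
import pandas as pd

# Fixed values copied from the frame printed in the question.
df1 = pd.DataFrame({"A": [6, 2, 1, 6], "B": [5, 2, 6, 9], "C": list("abba")})
df2 = pd.DataFrame({"A": [6, 9, 8, 7, 9], "B": [6, 9, 4, 2, 8], "C": list("aaabb")})
df = pd.concat([df1, df2], keys=["first", "second"])

# Step 1: group by the outer index level and column "C", then sum A and B.
outer = df.index.get_level_values(0)
sums = df.groupby([outer, df["C"]])[["A", "B"]].sum()

# Step 2: move the outer level into the columns and place it above A/B.
result = (
    sums.unstack(0)            # rows: C, columns: (A|B, first|second)
        .swaplevel(axis=1)     # columns: (first|second, A|B)
        .sort_index(axis=1)    # keep A and B together under each key
        .rename_axis(None)     # drop the "C" label on the row index
)
print(result)
```

Printing sums before the reshape shows the same totals in long form, with one row per (key, C) pair, which can be easier to check by hand.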
