user3376169 user3376169 - 1 month ago 15
Python Question

How to aggregate multiple columns in pandas groupby

I have created a pandas dataframe mn using following input:

keyA state n1 n2 d1 d2
key1 CA 100 1000 1 2
key2 FL 200 2000 2 4
key1 CA 300 3000 3 6
key1 AL 400 4000 4 8
key2 FL 500 5000 5 2
key1 NY 600 6000 6 4
key2 CA 700 7000 7 6


Have created a sum object as follows :

s = mn.groupby(['keyA','state'], as_index=False).sum()


How do I iterate the sum object
s
, so I can get following output:

The v1 column in the result below is computed as
s['n1']/s['d1']


The v2 column in the result below is computed as
s['n2']/s['d2']


keyA state v1 v2
'key1','AL',100,500
'key1','CA',100,500
'key1','NY',100,1500
'key2','CA',100,1166
'key2','FL',100,1166

Answer

Pretty much just write it like your pseudocode.

In [14]: s = mn.groupby(['keyA','state'], as_index=False).sum()

In [15]: s['v1'] = s['n1'] / s['d1']

In [16]: s['v2'] = s['n2'] / s['d2']

In [17]: s[['keyA', 'state', 'v1', 'v2']]
Out[17]: 
   keyA state   v1           v2
0  key1    AL  100   500.000000
1  key1    CA  100   500.000000
2  key1    NY  100  1500.000000
3  key2    CA  100  1166.666667
4  key2    FL  100  1166.666667

[5 rows x 4 columns]

I think you have a typo in your example data by the way. The second n1 header should be n2.

Comments