whytheq whytheq - 2 months ago 8
Python Question

Applying different aggregate functions when using pivot_table

I have this sample:

import pandas as pd
import numpy as np
dic = {'name':
['j','c','q','j','c','q','j','c','q'],
'foo or bar':['foo','bar','bar','bar','foo','foo','bar','foo','foo'],
'amount':[10,20,30, 20,30,40, 200,300,400]}
x = pd.DataFrame(dic)
x
pd.pivot_table(x,
values='amount',
index='name',
columns='foo or bar',
aggfunc=[np.mean, np.sum])


It returns this:

enter image description here

I'd like to just have the highlighted columns. Why can I not specify tuples in the aggfunc argument like this?

pd.pivot_table(x,
values='amount',
index='name',
columns='foo or bar',
aggfunc=[(np.mean, 'bar'), (np.sum, 'foo')])


Is using
.ix
like here (define aggfunc for each values column in pandas pivot table) the only option?

Answer

i think you can't specify tuples for the aggfunc parameter, but you can do something like this:

In [259]: p = pd.pivot_table(x,
   .....:                values='amount',
   .....:                index='name',
   .....:                columns='foo or bar',
   .....:                aggfunc=[np.mean, np.sum])

In [260]: p
Out[260]:
           mean       sum
foo or bar  bar  foo  bar  foo
name
c            20  165   20  330
j           110   10  220   10
q            30  220   30  440

In [261]: p.columns = ['{0[0]}_{0[1]}'.format(col) if col[1] else col[0] for col in p.columns.tolist()]

In [262]: p.columns
Out[262]: Index(['mean_bar', 'mean_foo', 'sum_bar', 'sum_foo'], dtype='object')

In [264]: p[['mean_bar','sum_foo']]
Out[264]:
      mean_bar  sum_foo
name
c           20      330
j          110       10
q           30      440
Comments