FrankBr FrankBr - 14 days ago 5
Python Question

Applying group aggregation over multiindex

I'm playing around with Pandas.
Given

day pokemon date cp
14 Abra 2016-11-14 14:08:37.205617 377
2016-11-14 22:47:02.467526 374
Bellsprout 2016-11-14 09:02:41.420506 460
2016-11-14 09:31:29.026961 541
2016-11-14 09:42:49.151360 125


I'd like to add a new colum for each pokemon value where the group relate cp mean is inserted.
As you probably think, this is a multiIndex structure where (day, pokemon, date) is the index tuple.

Until now I tried to solve this problem merging this dataframe with the one obtained by grouping by day and pokemon and applying the mean operation. A a result, I loose the date field and I'm still not able to merge it the one I posted above.

My expected result would be something like this:

day pokemon date cp mean
14 Abra 2016-11-14 14:08:37.205617 377
2016-11-14 22:47:02.467526 374 375.5
Bellsprout 2016-11-14 09:02:41.420506 460
2016-11-14 09:31:29.026961 541
2016-11-14 09:42:49.151360 125 375.3


How would you solve this?
Thanks, FB

Answer

I think you need transform first and then add NaN by boolean indexing with mask by inverting duplicated:

g = df.groupby(level=[0,1])
df['mean'] = g['cp'].transform('mean')
df['mean'] = df['mean'][g['mean'].apply(lambda x: ~x.duplicated(keep='last'))]
print (df)
                                            cp        mean
day pokemon    date                                       
14  Abra       2016-11-14 14:08:37.205617  377         NaN
               2016-11-14 22:47:02.467526  374  375.500000
    Bellsprout 2016-11-14 09:02:41.420506  460         NaN
               2016-11-14 09:31:29.026961  541         NaN
               2016-11-14 09:42:49.151360  125  375.333333
Comments