Dude Dude - 7 months ago 16
Python Question

How to use group by pandas to fetch a column corresponding to a max column?

My data set looks like this

playerid,position,points
1,pos1,10
1,pos2,15
1,pos3,4


I want to return the position for players for which they scored most points.
I can group_by to find most points for a particular player, but how do I get the position?

data.groupby(['playerid']).agg(np.max)['points']


I want to return both playerid and position

1,pos2

Answer

one way (among many) to do that:

In [133]: df
Out[133]:
   playerid position  points
0         1     pos1      10
1         1     pos2      15
2         1     pos3       4
3         2     pos1       2
4         2     pos2      10
5         2     pos3      18

In [134]: df.loc[df.groupby('playerid')['points'].idxmax(), ['playerid','position']]
Out[134]:
   playerid position
1         1     pos2
5         2     pos3