Noobie Noobie - 2 months ago 11
Python Question

How to slice and combine specific row values in a Pandas groupby?

Consider the following dataframe:

import pandas as pd

df = pd.DataFrame({'group1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
                   'group2': ['C', 'C', 'C', 'C', 'C', 'E', 'E', 'E'],
                   'time': [-6, -5, -4, -3, -2, -6, -3, -4],
                   'col': [1, 2, 3, 4, 5, 6, 7, 8]})

df
Out[36]:
   col group1 group2  time
0    1      A      C    -6
1    2      A      C    -5
2    3      A      C    -4
3    4      A      C    -3
4    5      A      C    -2
5    6      A      E    -6
6    7      A      E    -3
7    8      A      E    -4


My objective is to create a column that contains, for each group in ['group1','group2'], the ratio of col evaluated at time = -6 divided by col evaluated at time = -4.

That is, for group ['A','C'] I expect this column to be equal to 1/3, and for group ['A','E'] to 6/8. Both group1 and group2 take on many different values in the data.

How can I get that in Pandas?

Something like

df.groupby(['group1','group2']).transform(lambda x: x.ix[x['time'] == -6,'col'] / x.ix[x['time'] == -4,'col'])


does not work.
Any ideas?

Thanks!

Answer

You could do it without groupby like this:

dfm = pd.merge(df[df.time == -4],df[df.time == -6],on=["group1","group2"])
dfm['Div'] = dfm.col_y.div(dfm.col_x)
df = pd.merge(df,dfm[['group1','group2','Div']],on=["group1","group2"])

Output:

   col group1 group2  time       Div
0    1      A      C    -6  0.333333
1    2      A      C    -5  0.333333
2    3      A      C    -4  0.333333
3    4      A      C    -3  0.333333
4    5      A      C    -2  0.333333
5    6      A      E    -6  0.750000
6    7      A      E    -3  0.750000
7    8      A      E    -4  0.750000
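
If you would rather stay with groupby, here is a minimal sketch of one way it might look. groupby(...).apply hands each group's sub-frame to the function, so both time and col are available inside it. The helper name ratio_for_group is made up for illustration, and the sketch assumes each group has at most one row for time = -6 and one for time = -4; a group missing either value gets NaN rather than being dropped.

```python
import pandas as pd

df = pd.DataFrame({'group1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
                   'group2': ['C', 'C', 'C', 'C', 'C', 'E', 'E', 'E'],
                   'time': [-6, -5, -4, -3, -2, -6, -3, -4],
                   'col': [1, 2, 3, 4, 5, 6, 7, 8]})

keys = ['group1', 'group2']


def ratio_for_group(g):
    """Hypothetical helper: col at time -6 divided by col at time -4."""
    num = g.loc[g['time'] == -6, 'col']
    den = g.loc[g['time'] == -4, 'col']
    # Assumption: at most one matching row per group; NaN if either is missing.
    if num.empty or den.empty:
        return float('nan')
    return num.iloc[0] / den.iloc[0]


# One ratio per (group1, group2) pair, indexed by the group keys.
ratios = df.groupby(keys)[['time', 'col']].apply(ratio_for_group).rename('Div')

# Broadcast the per-group ratio back onto every row of the original frame.
result = df.join(ratios, on=keys)
print(result)
```

The join keeps every row of df, whereas the inner merges above drop any group that lacks one of the two time values.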