nos nos - 4 months ago 9
Python Question

python pandas reattach column after aggregation

My DataFrame looks like this

exams = pd.DataFrame({'id1':['1x', '1x','2x','3x','3x'], 'id2':['a','a','b','a','a'],'data':[1,2,3,4,5]})

id1 id2 data
0 1x a 1
1 1x a 2
2 2x b 3
3 3x a 4
4 3x a 5


Then I aggregate it to

exams_agg = exams.groupby('id1').agg('mean')


Then
exams_agg
looks like

data
id1
1x 1.5
2x 3
3x 4.5


I want to reattach
id2
column to
exams_agg
. So I was thinking about create a lookup table

lookup = exams[['id1', 'id2']]
exams_agg = pd.merge(exams_agg, lookup, left_index=True, right_on='id1')


But since
lookup
contains duplicate pairs of ids,
exams_agg
contains duplicates as well. What is a good way to create

data id2
id1
1x 1.5 a
2x 3 b
3x 4.5 a

3kt 3kt
Answer

If a unique id1 always corresponds to the same id2, you can simply add id2 in your groupby :

In [5]: df.groupby(['id1', 'id2']).agg('mean')
Out[5]: 
         data
id1 id2      
1x  a     1.5
2x  b     3.0
3x  a     4.5