The Cat - 2 years ago
Python Question

Groupby column and find min and max of each group

I have the following dataset,

         Day Element  Data_Value
6786   01-01    TMAX         112
9333   01-01    TMAX         101
9330   01-01    TMIN          60
11049  01-01    TMIN           0
6834   01-01    TMIN          25
11862  01-01    TMAX         113
1781   01-01    TMAX         115
11042  01-01    TMAX         105
1110   01-01    TMAX         111
651    01-01    TMIN          44
11350  01-01    TMIN          83
1798   01-02    TMAX          70
4975   01-02    TMAX          79
12774  01-02    TMIN           0
3977   01-02    TMIN          60
2485   01-02    TMAX          73
4888   01-02    TMIN          31
11836  01-02    TMIN          26
11368  01-02    TMAX          71
2483   01-02    TMIN          26


I want to group by Day, then find the overall min of TMIN and the max of TMAX and put these into a data frame, so I get an output like...

Day DayMin DayMax
01-01 0 115
01-02 0 79


I know I need to do,

df.groupby(by='Day')


but I am stuck on the next step. Should I create columns to store the TMAX and TMIN values?

Answer

You can use assign + abs, followed by groupby + agg:

# Take absolute values, then compute the per-day min and max of Data_Value
df = df.assign(Data_Value=df.Data_Value.abs())\
       .groupby(['Day']).Data_Value.agg([('Min' , 'min'), ('Max', 'max')])\
       .add_prefix('Day')

df 
       DayMin  DayMax
Day                  
01-01       0     115
01-02       0      79
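
Note that the snippet above takes the min and max over all rows for each day, regardless of Element. On the sample data that happens to give the same numbers. If you want the min taken from TMIN rows only and the max from TMAX rows only, one possible sketch is to filter before grouping. It assumes the column names from the question; the names day_min, day_max and result are made up for illustration, and the original index labels are left out:

```python
import pandas as pd

# Sample rows copied from the question (original index labels omitted).
df = pd.DataFrame({
    "Day": ["01-01"] * 11 + ["01-02"] * 9,
    "Element": [
        "TMAX", "TMAX", "TMIN", "TMIN", "TMIN", "TMAX",
        "TMAX", "TMAX", "TMAX", "TMIN", "TMIN",
        "TMAX", "TMAX", "TMIN", "TMIN", "TMAX",
        "TMIN", "TMIN", "TMAX", "TMIN",
    ],
    "Data_Value": [
        112, 101, 60, 0, 25, 113,
        115, 105, 111, 44, 83,
        70, 79, 0, 60, 73,
        31, 26, 71, 26,
    ],
})

# Per-day minimum, using TMIN rows only.
day_min = df.loc[df["Element"] == "TMIN"].groupby("Day")["Data_Value"].min()

# Per-day maximum, using TMAX rows only.
day_max = df.loc[df["Element"] == "TMAX"].groupby("Day")["Data_Value"].max()

# Align the two Series on Day and turn Day back into a regular column.
result = pd.concat(
    [day_min.rename("DayMin"), day_max.rename("DayMax")], axis=1
).reset_index()

print(result)
```

For the rows shown in the question this should give DayMin 0 and DayMax 115 for 01-01, and DayMin 0 and DayMax 79 for 01-02, matching the desired output.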