owise owise - 1 year ago 85
Python Question

Explaining Grouping in Pandas in a way similar to the group function in dplyr in R

Simple question that I could not find an answer for:

Why, when we group a pandas DataFrame by a variable and then sort the result, don't we see the grouped rows together, as we do with the group function in dplyr in R?

For example, I have this data frame:

Item Type Price
A 1 22
B 1 58
C 1 33
A 2 80
A 3 50
B 2 98
C 3 63
B 5 8


If we group by Item and then sort by Price, we should see the 'A's together, the 'B's together, and the two 'C's together, with each of these three groups sorted. How can we accomplish this in Python?

I tried this:

df.groupby('Item').sort_values(['Price']) # This is not right because we cannot access the sort function on the groupby object

df.sort_values('Price').groupby(['Item']) # This does part of the job, but I wonder why I cannot see the grouped items together?


The output is expected to look like this:

Item Type Price
A 2 80
A 3 50
A 1 22
B 2 98
B 1 58
B 5 8
C 3 63
C 1 33

Answer Source

To get your output, you can use df.sort_values:

In [783]: df.sort_values(['Item', 'Price'], ascending=[True, False])
Out[783]: 
  Item  Type  Price
3    A     2     80
4    A     3     50
0    A     1     22
5    B     2     98
1    B     1     58
7    B     5      8
6    C     3     63
2    C     1     33

A groupby is not needed: sorting on both columns already places each item's rows together.
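As a self-contained sketch of the above, the following rebuilds the data frame from the question (typed in by hand here, so the construction itself is an assumption about how df was created) and applies the two-column sort. It also illustrates why the second attempt in the question showed nothing grouped: groupby returns a lazy GroupBy object rather than a reordered data frame, and the rows only appear together once the groups are iterated or combined. The names result, grouped, and via_groupby are made up for illustration.

```python
import pandas as pd

# The question's data, entered by hand (assumed construction).
df = pd.DataFrame({
    "Item": ["A", "B", "C", "A", "A", "B", "C", "B"],
    "Type": [1, 1, 1, 2, 3, 2, 3, 5],
    "Price": [22, 58, 33, 80, 50, 98, 63, 8],
})

# Sort by Item ascending, then by Price descending within each Item.
result = df.sort_values(["Item", "Price"], ascending=[True, False])
print(result)

# groupby does not reorder anything by itself; it returns a GroupBy object.
grouped = df.sort_values("Price", ascending=False).groupby("Item")
print(type(grouped).__name__)

# Iterating yields the groups in key order, and the rows inside each group
# keep the order they had before grouping, so concatenating the groups
# gives the same layout as the two-column sort.
via_groupby = pd.concat([group for _, group in grouped])
print(via_groupby.equals(result))
```

The plain sort_values call is the simpler choice here; the groupby route is shown only to make the behaviour of the grouped object visible.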
