SirTobi SirTobi - 1 year ago 73
Python Question

Pandas Dataframe ordering and sorting of column values

I was wondering if someone knows a good way on how to sort a pandas dataframe in the following way:

a) I have the following randomly sorted data with an id that appears multiple times and a label that is either 0 or 1:


id | label
------ | ------
1 | 1
1 | 0
1 | 0
2 | 1
2 | 0
2 | 0
3 | 0
3 | 0
3 | 0


I would like to sort the labels in ascending order and then also sort the id's in ascending order, but not grouped, so like this:


id | label
------ | ------
1 | 0
2 | 0
3 | 0
1 | 0
2 | 0
3 | 0
3 | 0
1 | 1
2 | 1


Thanks in advance!

Answer Source

First sort by id and label, then use cumcount to create an index representing 1,2,3 groups, then sort on index and by labels.

df_out = df.sort_values(by=['id','label'])\
  .set_index(df.groupby('id').cumcount())\
  .sort_index()\
  .sort_values(by='label')

Output:

   id  label
0   1      0
0   2      0
0   3      0
1   1      0
1   2      0
1   3      0
2   3      0
2   1      1
2   2      1
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download