SUNDONG SUNDONG - 3 months ago 15
Python Question

Sort by certain order (Situation: pandas DataFrame Groupby)

I want to change the day of order presented by below code.

What I want is a result with the order (Mon, Tue, Wed, Thu, Fri, Sat, Sun)
- should I say, sort by key in certain predefined order?




Here is my code which needs some tweak:

f8 = df_toy_indoor2.groupby(['device_id', 'day'])['dwell_time'].sum()

print(f8)


Current result:

device_id day
device_112 Thu 436518
Wed 636451
Fri 770307
Tue 792066
Mon 826862
Sat 953503
Sun 1019298
device_223 Mon 2534895
Thu 2857429
Tue 3303173
Fri 3548178
Wed 3822616
Sun 4213633
Sat 4475221


Desired result:

device_id day
device_112 Mon 826862
Tue 792066
Wed 636451
Thu 436518
Fri 770307
Sat 953503
Sun 1019298
device_223 Mon 2534895
Tue 3303173
Wed 3822616
Thu 2857429
Fri 3548178
Sat 4475221
Sun 4213633





Here,
type(df_toy_indoor2.groupby(['device_id', 'day'])['dwell_time'])
is a class 'pandas.core.groupby.SeriesGroupBy'.

I have found
.sort_values()
, but it is a built-in sort function by values.

I want to get some pointers to set some order to use it further data manipulation.

Thanks in advance.

Answer

Took me some time, but I found the solution. reindex does what you want. See my code example:

a = [1, 2] * 2 + [2, 1] * 3 + [1, 2]
b = ['Mon', 'Wed', 'Thu', 'Fri'] * 3
c = list(range(12))
df = pd.DataFrame(data=[a,b,c]).T
df.columns = ['device', 'day', 'value']
df = df.groupby(['device', 'day']).sum()

gives:

            value
device day       
1      Fri      7
       Mon      0
       Thu     12
       Wed     14
2      Fri     14
       Mon     12
       Thu      6
       Wed      1

Then doing reindex:

df.reindex(['Mon', 'Wed', 'Thu', 'Fri'], level='day')

or more conveniently (credits to burhan)

df.reindex(list(calendar.day_abbr), level='day')

gives:

            value
device day       
1      Mon      0
       Wed     14
       Thu     12
       Fri      7
2      Mon     12
       Wed      1
       Thu      6
       Fri     14