Georg Heiler Georg Heiler - 1 month ago 14
Python Question

pandas multi index sort specific fields

I obtained a multi index in pandas by running series.describe() for a grouped dataframe. How can I sort these series by

modelName.mean
and only keep sepcific fields?multi index
This

summary.sortlevel(1)['kappa']


sorts them but retains all the other fields like count. How can I only keep
mean
and
std
?

Answer

You can do it this way:

Data:

In [77]: df
Out[77]:
                        0
level_1 level_0
a       25%      2.000000
        50%      4.000000
        75%      7.000000
        count    5.000000
        max      7.000000
        mean     4.400000
        min      2.000000
        std      2.509980
b       25%      2.000000
        50%      6.000000
        75%      8.000000
        count    5.000000
        max      8.000000
        mean     5.000000
        min      1.000000
        std      3.316625
c       25%      3.000000
        50%      4.000000
        75%      5.000000
        count    5.000000
        max      8.000000
        mean     4.000000
        min      0.000000
        std      2.915476
d       25%      4.000000
        50%      8.000000
        75%      8.000000
        count    5.000000
        max      9.000000
        mean     6.000000
        min      1.000000
        std      3.391165

Solution:

In [78]: df.loc[pd.IndexSlice[:, ['mean','std']], :]
Out[78]:
                        0
level_1 level_0
a       mean     4.400000
        std      2.509980
b       mean     5.000000
        std      3.316625
c       mean     4.000000
        std      2.915476
d       mean     6.000000
        std      3.391165

Setup:

df = (pd.DataFrame(np.random.randint(0,10,(5,4)),columns=list('abcd'))
        .describe()
        .stack()
        .reset_index()
        .set_index(['level_1','level_0'])
        .sort_index()
)
Comments