user3059024 - 7 days ago
Python Question

How to correctly sort a multi-indexed pandas DataFrame

I have a multi-indexed pandas dataframe that looks like this:

Antibody  Time  Repeats
Akt       0     1          1.988053
                2          1.855905
                3          1.416557
          5     1          1.143599
                2          1.151358
                3          1.272172
          10    1          1.765615
                2          1.779330
                3          1.752246
          20    1          1.685807
                2          1.688354
                3          1.614013
.....                      ....
          0     4          2.111466
                5          1.933589
                6          1.336527
          5     4          2.006936
                5          2.040884
                6          1.430818
          10    4          1.398334
                5          1.594028
                6          1.684037
          20    4          1.529750
                5          1.721385
                6          1.608393


(Note that I've only posted one antibody; there are many analogous entries under the Antibody index, but they all have the same format.) Although I've left out the entries in the middle to save space, you can see that I have 6 experimental repeats, but they are not organized properly: repeats 1-3 and repeats 4-6 for the same time point sit in separate blocks. My question is: how do I get the DataFrame to bring all the repeats for each time point together, so that the output looks something like this:

Antibody  Time  Repeats
Akt       0     1          1.988053
                2          1.855905
                3          1.416557
                4          2.111466
                5          1.933589
                6          1.336527
          5     1          1.143599
                2          1.151358
                3          1.272172
                4          2.006936
                5          2.040884
                6          1.430818
          10    1          1.765615
                2          1.779330
                3          1.752246
                4          1.398334
                5          1.594028
                6          1.684037
          20    1          1.685807
                2          1.688354
                3          1.614013
                4          1.529750
                5          1.721385
                6          1.608393
.....                      ....


Thanks in advance

Answer

I think you need sort_index:

df = df.sort_index(level=[0, 1, 2])
print(df)
Antibody  Time  Repeats
Akt       0     1          1.988053
                2          1.855905
                3          1.416557
                4          2.111466
                5          1.933589
                6          1.336527
          5     1          1.143599
                2          1.151358
                3          1.272172
                4          2.006936
                5          2.040884
                6          1.430818
          10    1          1.765615
                2          1.779330
                3          1.752246
                4          1.398334
                5          1.594028
                6          1.684037
          20    1          1.685807
                2          1.688354
                3          1.614013
                4          1.529750
                5          1.721385
                6          1.608393
Name: col, dtype: float64
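
If the positional numbers are hard to read, the levels can also be given by name. The following is a minimal sketch on a small hand-made Series; the variable names (s, by_name, by_position) and the four sample rows are only for illustration and are a subset of the values from the question:

```python
import pandas as pd

# Small illustrative sample, deliberately out of order.
index = pd.MultiIndex.from_tuples(
    [("Akt", 5, 1), ("Akt", 0, 4), ("Akt", 0, 1), ("Akt", 5, 4)],
    names=["Antibody", "Time", "Repeats"],
)
s = pd.Series([1.143599, 2.111466, 1.988053, 2.006936], index=index, name="col")

# Sort by level names instead of level positions.
by_name = s.sort_index(level=["Antibody", "Time", "Repeats"])

# Same sort expressed with positions, as in the answer above.
by_position = s.sort_index(level=[0, 1, 2])

print(by_name)
```

Both calls should give the same ordering, so which one to use is mostly a matter of readability.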

Or you can omit the level parameter, because sort_index sorts by all index levels by default:

df = df.sort_index()
print(df)
Antibody  Time  Repeats
Akt       0     1          1.988053
                2          1.855905
                3          1.416557
                4          2.111466
                5          1.933589
                6          1.336527
          5     1          1.143599
                2          1.151358
                3          1.272172
                4          2.006936
                5          2.040884
                6          1.430818
          10    1          1.765615
                2          1.779330
                3          1.752246
                4          1.398334
                5          1.594028
                6          1.684037
          20    1          1.685807
                2          1.688354
                3          1.614013
                4          1.529750
                5          1.721385
                6          1.608393
Name: col, dtype: float64
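
To try this without the original data, here is a self-contained sketch that rebuilds part of the question's layout (time points 0 and 5 only) and checks that both calls agree. The names rows, frame, unsorted, sorted_all and sorted_explicit are made up for this example, and the value column is called col only because that is the name shown in the output above:

```python
import pandas as pd

# Rows in the question's order: repeats 1-3 for each time point, then 4-6.
rows = [
    ("Akt", 0, 1, 1.988053), ("Akt", 0, 2, 1.855905), ("Akt", 0, 3, 1.416557),
    ("Akt", 5, 1, 1.143599), ("Akt", 5, 2, 1.151358), ("Akt", 5, 3, 1.272172),
    ("Akt", 0, 4, 2.111466), ("Akt", 0, 5, 1.933589), ("Akt", 0, 6, 1.336527),
    ("Akt", 5, 4, 2.006936), ("Akt", 5, 5, 2.040884), ("Akt", 5, 6, 1.430818),
]
frame = pd.DataFrame(rows, columns=["Antibody", "Time", "Repeats", "col"])

# Move the three key columns into a MultiIndex and keep the value column.
unsorted = frame.set_index(["Antibody", "Time", "Repeats"])["col"]

# Sort on every index level (the default) and on explicitly listed levels.
sorted_all = unsorted.sort_index()
sorted_explicit = unsorted.sort_index(level=[0, 1, 2])

print(sorted_all)
```

After sorting, repeats 1 to 6 for each time point should appear together, as in the desired output from the question.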