Manuel Zompetta Manuel Zompetta - 1 month ago 6
Python Question

pandas sorting pivot_table or grouping dataframe?

I have a problem. I did this:

In [405]: pippo=ass_t1.pivot_table(['Rotazioni a volume','Distribuzione Ponderata'],rows=['SEGM1','DESC']).sort()

In [406]: pippo
Out[406]:
Distribuzione Ponderata Rotazioni a volume
SEGM1 DESC
AD ACCADINAROLO 74.040 140249.693409
ZYMIL AMALAT Z 90.085 321529.053570
FUN SPECIALMALAT S 88.650 120711.182177
NORM STD INNAROLO 49.790 162259.216710
STD P.NAROLO 52.125 1252174.695695
STD PLNAROLO 54.230 213257.829615
BONTA' MALAT B 79.280 520454.366419
DA STD RILGARD 35.290 554927.497875
OVANE VT.MANTO 15.040 466232.639628
WEIGHT MALAT W 79.170 118628.572692


My goal is to have each 'SEGM1' sorted by 'Distribuzione Ponderata'. E.g. in 'NORM' subset the first row should be "BONTA' MALAT B" with the higher level of 'Distribuzione Ponderata'.
I was able to achieve partially the result using groupby method but without being able to set multiple columns.
Someone can help me please?

Answer
import io
import pandas as pd
import numpy as np

text = '''\
SEGM1\tDESC\tDistribuzione Ponderata\tRotazioni a volume
AD\tACCADINAROLO\t74.040\t140249.693409
AD\tZYMIL AMALAT Z\t90.085\t321529.053570
FUN\tSPECIALMALAT S\t88.650\t120711.182177
NORM\tSTD INNAROLO\t49.790\t162259.216710
NORM\tSTD P.NAROLO\t52.125\t1252174.695695
NORM\tSTD PLNAROLO\t54.230\t213257.829615
NORM\tBONTA' MALAT B\t79.280\t520454.366419
NORM\tDA STD RILGARD\t35.290\t554927.497875
NORM\tOVANE VT.MANTO\t15.040\t466232.639628
NORM\tWEIGHT MALAT W\t79.170\t118628.572692
'''

df = pd.read_csv(io.BytesIO(text), delimiter = '\t',
                 index_col = (0,1),)

key1 = df.index.labels[0]
key2 = df['Distribuzione Ponderata'].rank(ascending=False)
sorter = np.lexsort((key2, key1))

sorted_df = df.take(sorter)
print(sorted_df)

yields

                      Distribuzione Ponderata  Rotazioni a volume
SEGM1 DESC                                                       
AD    ZYMIL AMALAT Z                   90.085       321529.053570
      ACCADINAROLO                     74.040       140249.693409
FUN   SPECIALMALAT S                   88.650       120711.182177
NORM  BONTA' MALAT B                   79.280       520454.366419
      WEIGHT MALAT W                   79.170       118628.572692
      STD PLNAROLO                     54.230       213257.829615
      STD P.NAROLO                     52.125      1252174.695695
      STD INNAROLO                     49.790       162259.216710
      DA STD RILGARD                   35.290       554927.497875
      OVANE VT.MANTO                   15.040       466232.639628

I learned this trick here. The key idea is to use numpy.lexsort.