Fred S - 1 year ago 221
Python Question

# How can I sort a boxplot in pandas by the median values?

I want to draw a boxplot of column `Z` in dataframe `df` by the categories `X` and `Y`. How can I sort the boxplot by the median, in descending order?

```python
import pandas as pd
import random

n = 100
# this is probably a strange way to generate random data; please feel free to correct it
df = pd.DataFrame({"X": [random.choice(["A", "B", "C"]) for i in range(n)],
                   "Y": [random.choice(["a", "b", "c"]) for i in range(n)],
                   "Z": [random.gauss(0, 1) for i in range(n)]})
df.boxplot(column="Z", by=["X", "Y"])
```

Note that this question is very similar, but it uses a different data structure. I'm relatively new to pandas (and have only done some tutorials on Python in general), so I couldn't figure out how to make my data work with the answer posted there. This may well be more of a reshaping question than a plotting one. Maybe there is a solution using `groupby`?

You can use the answer in "How to sort a boxplot by the median values in pandas", but first you need to group your data and build a new data frame with one column per group:

```python
import pandas as pd
import random
import matplotlib.pyplot as plt

n = 100
# this is probably a strange way to generate random data; please feel free to correct it
df = pd.DataFrame({"X": [random.choice(["A", "B", "C"]) for i in range(n)],
                   "Y": [random.choice(["a", "b", "c"]) for i in range(n)],
                   "Z": [random.gauss(0, 1) for i in range(n)]})
grouped = df.groupby(["X", "Y"])

# one column of Z values per (X, Y) group; rows outside a group are NaN
df2 = pd.DataFrame({col: vals["Z"] for col, vals in grouped})

# median of each group, largest first
meds = df2.median().sort_values(ascending=False)
# reorder the columns to match the sorted medians
df2 = df2[meds.index]
df2.boxplot()

plt.show()
```
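Since the question asks about `groupby`, here is a rough sketch of the same idea wrapped in a helper, with the order taken directly from `groupby(...).median()`. The helper name `median_sorted_frame`, the NumPy-based data generation, and the seed are my own choices for illustration and not part of the original answer.

```python
import numpy as np
import pandas as pd


def median_sorted_frame(df, value, by):
    """Return one column per group of `by`, ordered by descending median of `value`.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Median of `value` within each group, largest first.
    order = df.groupby(by)[value].median().sort_values(ascending=False).index
    # One Series per group; pandas aligns them on the original row index,
    # so each column is NaN wherever a row belongs to another group.
    columns = {key: grp[value] for key, grp in df.groupby(by)}
    return pd.DataFrame(columns).reindex(columns=order)


# Sample data in the shape used in the question (seed chosen arbitrarily).
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "X": rng.choice(["A", "B", "C"], size=n),
    "Y": rng.choice(["a", "b", "c"], size=n),
    "Z": rng.normal(0, 1, size=n),
})

wide = median_sorted_frame(df, "Z", ["X", "Y"])
print(wide.median())  # medians should appear in descending order
```

Calling `wide.boxplot(rot=90)` followed by `plt.show()` (after `import matplotlib.pyplot as plt`) should then draw the boxes in that order.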