agenis - 1 year ago 120
R Question

boxplot: order groups by the mean of a subset of each group

Let's consider this data:

``````
df <- data.frame(score = round(runif(15, 1, 10)),
                 group = paste0("a", rep(c(1, 2, 3), each = 5)),
                 category = rep(c("big", "big", "big", "big", "small"), 3))
``````

I would like to draw boxplots of this data with `ggplot2`. What I want is the equivalent of `boxplot(score ~ group)`, but with the boxes ordered by the mean score of the "big" individuals in each group.

I can't find a simple way to do this without creating new variables. Using `dplyr` is fine. Thanks.

I don't know whether this qualifies as simple, though I personally find it so. I use `dplyr` to compute the mean of the "big" rows in each group:

``````
# find the mean of the "big" rows in each group
library(dplyr)
means <- df %>%
  # drop 'small', since only category equal to 'big' is needed
  filter(category == 'big') %>%
  # use the same groups as in the ggplot
  group_by(group) %>%
  # calculate the means
  summarise(mean = mean(score))

# order the groups by increasing mean
myorder <- means$group[order(means$mean)]
``````

The scores are random, so the result varies between runs. For the data I generated, the order is:

``````
> myorder
[1] a1 a2 a3
``````
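Because `runif` gives different scores on every run, a worked example with fixed numbers may make the ordering easier to check. The following is only a sketch in base R, not the `dplyr` code above: the scores are made up, and the names `df_fixed`, `big_means`, `order_big` and `order_all` are hypothetical. It computes the same kind of ordering and compares it with the ordering by the mean of all rows.

```r
# Hypothetical fixed scores (NOT the random data from the question),
# chosen so that the "small" rows change the overall ranking.
df_fixed <- data.frame(
  score    = c(2, 3, 4, 3, 10,   8, 9, 7, 8, 1,   5, 6, 5, 4, 1),
  group    = paste0("a", rep(1:3, each = 5)),
  category = rep(c("big", "big", "big", "big", "small"), 3)
)

# Keep only the "big" rows, then average the score within each group.
big <- df_fixed[df_fixed$category == "big", ]
big_means <- tapply(big$score, big$group, mean)

# Group names sorted by increasing mean of the "big" rows.
order_big <- names(big_means)[order(big_means)]
print(order_big)

# For comparison: group names sorted by the mean of ALL rows.
all_means <- tapply(df_fixed$score, df_fixed$group, mean)
order_all <- names(all_means)[order(all_means)]
print(order_all)
```

With these numbers the "big" means are 3, 8 and 5 for a1, a2 and a3, while the overall means are 4.4, 6.6 and 4.2, so the two orderings differ. That difference is the reason for filtering before taking the mean.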

To draw the boxplots in that order, pass `myorder` to `scale_x_discrete`:

``````
library(ggplot2)
ggplot(df, aes(group, score)) +
  geom_boxplot() +
  # the limits argument of scale_x_discrete sets the order in which
  # the boxplots appear; here that order is the myorder vector
  scale_x_discrete(limits = myorder)
``````

And that's it.
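If you want to confirm that the axis really follows the requested order without looking at the picture, one option is to read the limits back from the plot object. This is a rough sketch under assumptions: the data and the names `df_fixed`, `order_big` and `p` are hypothetical, and the order is written out by hand instead of being computed. `layer_scales()` is the `ggplot2` helper that returns the x and y scales of a plot.

```r
library(ggplot2)

# Hypothetical fixed scores (NOT the random data from the question).
df_fixed <- data.frame(
  score    = c(2, 3, 4, 3, 10,   8, 9, 7, 8, 1,   5, 6, 5, 4, 1),
  group    = paste0("a", rep(1:3, each = 5)),
  category = rep(c("big", "big", "big", "big", "small"), 3)
)

# Assumed order: increasing mean of the "big" rows (a1 = 3, a3 = 5, a2 = 8).
order_big <- c("a1", "a3", "a2")

# All rows go into the boxplots; only the order of the x axis changes.
p <- ggplot(df_fixed, aes(group, score)) +
  geom_boxplot() +
  scale_x_discrete(limits = order_big)

# Read the x-axis limits back from the plot object.
x_limits <- layer_scales(p)$x$get_limits()
print(x_limits)
```

Note that `limits` only reorders the axis. The "small" rows are still part of each box, which matches what the question asks for.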

