Andre Elrico - 1 year ago
R Question

Add n, mean and sd as numbers/numerics under boxplots with labels aligned under y-axis

I'm aware of that thread, but the solution looks rather long and complicated to me. Is there a quick and easy way to add the following vector under the y-axis?

yLabels <- c("","","n","mn","sd")

Reproducible data:


library(ggplot2)
library(magrittr)  # provides %>%

mtcars <- mtcars

# rows: n, mean and sd of mpg per gear
values <- rbind(tapply(mtcars$mpg,mtcars$gear,length)) %>% rbind(tapply(mtcars$mpg,mtcars$gear,mean)) %>% rbind(tapply(mtcars$mpg,mtcars$gear,sd))

# rows: gear level, then an empty line
levels <- rbind(levels(mtcars$gear%>%factor),matrix("",ncol=ncol(values)))

# one multi-line label per gear
xlabs <- rbind(levels,values) %>% apply(.,2,function(x) {paste(x,collapse="\n")})
ggplot(mtcars, aes(x=factor(gear), y=mpg, fill=factor(gear))) + geom_boxplot() + scale_x_discrete(labels=xlabs)

This is what the above code produces:

[Image: Missing labels to explain numerics]

This is what I want: labels for n, mean and sd, placed under the y-axis and in line with the numbers.


Answer Source

Not the nicest solution, but it may give you an idea of how to solve this in a general way:

xlabs <- c('2.5'='\n\nn\nmn\nsd', 
           rbind(levels,values) %>% apply(.,2,function(x) {paste(x,collapse="\n")})) 
ggplot(mtcars, aes(x=gear, y=mpg, fill=factor(gear))) + 
  geom_boxplot() + 
  scale_x_continuous(breaks=c(2.5,3,4,5), labels=xlabs) +
  theme(axis.ticks.x=element_line(color=c('white', rep('black', length(xlabs[-1])))))

I think the trick is to use numeric values instead of the factor, so that you can add an extra tick label close to the axis limits. That tick can then be labeled, and its tick mark is drawn in white in the end.

Pretty hacky but I guess there is potential...
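
The label vector used above can also be assembled step by step in base R, which makes it easier to see what each piece contributes. This is only a sketch: the rounding to two decimals is an assumption that is not in the original code, and the names stats, group_labels and row_label are made up for illustration.

```r
# Summary statistics of mpg per gear; rows are named after the wanted labels.
# Rounding to 2 decimals is an assumption added here for readability.
stats <- rbind(
  n  = tapply(mtcars$mpg, mtcars$gear, length),
  mn = round(tapply(mtcars$mpg, mtcars$gear, mean), 2),
  sd = round(tapply(mtcars$mpg, mtcars$gear, sd), 2)
)

# One label per gear: the level, an empty line, then n / mean / sd.
group_labels <- paste0(colnames(stats), "\n\n",
                       apply(stats, 2, paste, collapse = "\n"))

# Label for the extra tick: two empty lines, then the row names.
row_label <- paste(c("", "", rownames(stats)), collapse = "\n")

# Extra break at 2.5 first, then one break per gear value.
xlabs  <- c(row_label, group_labels)
breaks <- c(2.5, as.numeric(colnames(stats)))
```

Here breaks and xlabs have the same length and order, so they can be passed to the breaks and labels arguments of scale_x_continuous as in the plot above.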


Ok, a little bit more general in case of factors:

mtcars$test <- as.factor(mtcars$gear)
xlabs <- 
    c('0.5'='\n\nn\nmn\nsd',
      rbind(levels,values) %>% 
        apply(.,2,function(x) {paste(x,collapse="\n")})) 

ggplot(mtcars, aes(x=as.numeric(test), y=mpg, fill=factor(gear))) + 
  geom_boxplot() + 
  scale_x_continuous(breaks=c(0.5, seq(1,length(levels(mtcars$test)))), 
                     labels=xlabs) +
  theme(axis.ticks.x=element_line(color=c('white', 
                                          rep('black', length(xlabs[-1])))))

A factor can be converted with as.numeric, which gives its integer codes, starting at 1. So you can put these codes on a continuous scale, add an extra break at 0 or 0.5, and add the extra label to the xlabs variable. To hide the extra tick mark, pass one white tick color followed by one black per level of the column used for the x-axis.
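
To see why 0.5 is a safe position for the extra break, it can help to look at the integer codes directly. A small base R sketch; gear_factor, codes, breaks and tick_colours are illustrative names, not part of the answer's code.

```r
# A factor stores its levels as integer codes 1..k.
gear_factor <- factor(mtcars$gear)   # levels "3", "4", "5"
codes       <- as.numeric(gear_factor)
n_levels    <- nlevels(gear_factor)

# The boxes sit at 1..k, so 0.5 lies just left of the first box.
breaks <- c(0.5, seq_len(n_levels))

# One white tick for the extra break, one black tick per level.
tick_colours <- c("white", rep("black", n_levels))
```

Both breaks and tick_colours end up with one entry more than there are levels, which is what keeps the extra label and its hidden tick aligned.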
