rishi - 3 months ago
R Question

How to plot multiple categorical variables in R

My data set contains several categorical variables whose distributions I would like to visualise.

For example, if I wanted to visualise the 4 variables (manufacturer, trans, fl, class) in the mpg data set in ggplot2, I have to write 4 lines of code:

ggplot(mpg, aes(manufacturer)) + geom_bar() + coord_flip()
ggplot(mpg, aes(trans)) + geom_bar() + coord_flip()
ggplot(mpg, aes(fl)) + geom_bar() + coord_flip()
ggplot(mpg, aes(class)) + geom_bar() + coord_flip()


[Image: the resulting bar plots]

How can I write code that does this more efficiently, for example with a loop or an apply function? I would like to see each chart one at a time, if possible.

Answer

Your idea to use lapply is one solution.

This requires aes_string to be used instead of aes, because the column names are passed as character strings.

Single plots

This creates one plot for each column name you supply as the first argument to lapply:

library(ggplot2)
# Returns a list with one bar chart per column name
lapply(c("manufacturer", "trans", "fl", "class"),
  function(col) {
    ggplot(mpg, aes_string(col)) + geom_bar() + coord_flip()
  })
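
Note that aes_string is soft-deprecated in ggplot2 3.0.0 and later. A minimal sketch of the same idea with the .data pronoun is shown here, assuming a reasonably recent ggplot2; the names cols and plots are only illustrative. It also stores the plots in a list and prints them explicitly, which is what you need to see each chart one at a time from inside a script, loop, or function (where a returned list is not auto-printed).

```r
library(ggplot2)

# Column names to plot (illustrative variable name)
cols <- c("manufacturer", "trans", "fl", "class")

# Build one bar chart per column; .data[[col]] looks the column up by its name
plots <- lapply(cols, function(col) {
  ggplot(mpg, aes(x = .data[[col]])) +
    geom_bar() +
    coord_flip() +
    labs(x = col)  # label the axis with the column name
})
names(plots) <- cols

# Draw the charts one after another
for (p in plots) print(p)
```

Because the list is named, a single chart can also be drawn on its own, for example with print(plots[["class"]]).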

Combined plots

If you need all plots in one plotting area, you can use miscset::ggplotGrid:

library(ggplot2)
library(miscset) # install from CRAN if required
ggplotGrid(ncol = 2,
  lapply(c("manufacturer", "trans", "fl", "class"),
    function(col) {
      ggplot(mpg, aes_string(col)) + geom_bar() + coord_flip()
    }))

[Image: the four bar plots combined in a 2 x 2 grid]
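
If miscset is not available, a similar grid can probably be produced with gridExtra::grid.arrange, which accepts a list of plots through its grobs argument. This is only a sketch of an alternative, not part of the approach above; it assumes the gridExtra package is installed, and the names plots and combined are illustrative.

```r
library(ggplot2)
library(gridExtra)  # assumption: gridExtra is installed

# One bar chart per column, built as in the single-plot example
plots <- lapply(c("manufacturer", "trans", "fl", "class"),
  function(col) {
    ggplot(mpg, aes(x = .data[[col]])) +
      geom_bar() +
      coord_flip() +
      labs(x = col)
  })

# Draw all plots on one page in two columns;
# the arranged layout is also returned invisibly
combined <- grid.arrange(grobs = plots, ncol = 2)
```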
