Perceptron - 4 years ago
R Question

ggplot boxplot multiple columns with a factor with condition

Example data frame:

a <- c(1, 0, 1)
b <- c(0, 1, 0)
c <- c(1, 1, 1)
total <- c(100,200,300)
my.data <- data.frame(a, b, c, total)

> my.data
  a b c total
1 1 0 1   100
2 0 1 1   200
3 1 0 1   300


I would like to create a single boxplot showing the distribution of "total" for each of the columns a, b and c, but only considering the rows where that column's value is 1.
Example: Column a's row 2 is ignored because it is 0, so column a has a distribution of 100 and 300. Column b has a distribution of 200, and column c has a distribution of 100, 200, 300.

I can plot them separately:

ggplot(subset(my.data,a==1), aes(x=a,y=total)) +
geom_boxplot()

ggplot(subset(my.data,b==1), aes(x=b,y=total)) +
geom_boxplot()

ggplot(subset(my.data,c==1), aes(x=c,y=total)) +
geom_boxplot()


I also tried the following, but it's not correct:

ggplot(my.data, aes(x=as.factor(c("a","b","c")),y=total)) +
geom_boxplot()


Hoping there is an awesome R function/method that lets me do my plot in one shot. I don't think I can use melt() because of the total column. Thanks in advance.

Answer

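Your data should be in long format. One way to get there is melt() from the reshape2 package. With measure.vars given, the total column is kept as an id variable and repeated for each of a, b and c, so it does not get in the way. For example: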

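library(reshape2)
library(ggplot2)

# a, b, c become rows (columns "variable" and "value"); total is kept as an id column
my.data <- melt(my.data, measure.vars=c("a","b","c"))

# keep only the rows where the flag is 1 and draw one box per original column
ggplot(subset(my.data, value==1), aes(x=variable,y=total)) +
  geom_boxplot()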
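
If you want to check which totals end up in each box before plotting, here is a minimal base R sketch of the same long-format idea. It assumes c is all 1s, as in the printed data frame, and the names long, kept and groups are just made up for this illustration:

```r
# Example data; c is assumed to be all 1s, matching the printed data frame
my.data <- data.frame(a = c(1, 0, 1),
                      b = c(0, 1, 0),
                      c = c(1, 1, 1),
                      total = c(100, 200, 300))

# stack() puts a, b, c into one "values" column, with an "ind" factor
# recording which column each value came from
long <- stack(my.data[c("a", "b", "c")])

# the stacked rows run a1..a3, b1..b3, c1..c3, so total repeats once per column
long$total <- rep(my.data$total, times = 3)

# keep only the rows where the flag is 1, then list the totals per column
kept <- long[long$values == 1, ]
groups <- split(kept$total, kept$ind)
print(groups)
```

The kept data frame from this sketch could also be plotted directly with ggplot(kept, aes(x = ind, y = total)) + geom_boxplot(), which should give the same picture as the melt() version.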