ECII ECII - 3 months ago 7
R Question

Color of histogram bin based on the distribution of another variable

Ok this is a tricky one. It might not be possible.

test<-data.frame(var.a=c(1,1,1,1,2,2,2,3,3,3,3,3,4,4,5,5,5,5), var.b=c(1,2,1,3,2,3,4,3,2,2,1,2,1,2,3,4,1,2))


Is it possible to color each bin of the hist(test$var.a) histogram based on the distribution of var.b, so that I can tell that bin 1 of hist(test$var.a) contains 50% "ones", 25% "twos" and 25% "threes" of var.b? Something like stacked bars inside each bin?
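For reference, the per-bin shares quoted above can be checked with a cross-tabulation. This is only a minimal sketch in base R; the names `counts` and `shares` are chosen here for illustration and are not part of the original post.

```r
test <- data.frame(
  var.a = c(1,1,1,1,2,2,2,3,3,3,3,3,4,4,5,5,5,5),
  var.b = c(1,2,1,3,2,3,4,3,2,2,1,2,1,2,3,4,1,2)
)

# Cross-tabulate: rows are values of var.a, columns are values of var.b
counts <- table(var.a = test$var.a, var.b = test$var.b)

# Share of each var.b value within each var.a value (every row sums to 1)
shares <- prop.table(counts, margin = 1)

# Show the shares as whole percentages
print(round(100 * shares))
```

The first row of the printed table corresponds to bin 1 of the histogram.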

I guess this would be some kind of spinogram, except that the bars should not all have the same height (their height represents the frequency of var.a), and inside each bar the frequency of var.b should be color coded.

Thanks a lot

Answer

ggplot2 has just what you're looking for:

test<-data.frame(var.a=c(1,1,1,1,2,2,2,3,3,3,3,3,4,4,5,5,5,5), var.b=c(1,2,1,3,2,3,4,3,2,2,1,2,1,2,3,4,1,2))

library(ggplot2)
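# qplot() is deprecated as of ggplot2 3.4.0; ggplot() builds the same plot.
# Each var.a bin is stacked by the counts of the var.b values it contains.
ggplot(test, aes(x = var.a, fill = factor(var.b))) +
  geom_histogram(binwidth = 1)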
ggsave("stacked_histogram.pdf")

[Figure: stacked histogram]
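The stacked bars described in the question can probably also be drawn without ggplot2, since base `barplot()` stacks the rows of a matrix within each bar. The following is a rough sketch under that assumption, not a drop-in replacement for `hist()`: it treats var.a as categorical, and the names `counts`, `cols` and `bar_mids` as well as the grey palette are illustrative choices.

```r
test <- data.frame(
  var.a = c(1,1,1,1,2,2,2,3,3,3,3,3,4,4,5,5,5,5),
  var.b = c(1,2,1,3,2,3,4,3,2,2,1,2,1,2,3,4,1,2)
)

# Rows (var.b) become the stacked segments, columns (var.a) become the bars
counts <- table(var.b = test$var.b, var.a = test$var.a)

# One fill color per var.b value (arbitrary palette)
cols <- c("grey20", "grey45", "grey70", "grey90")

# space = 0 makes the bars touch, as in a histogram;
# barplot() returns the x positions of the bar midpoints
bar_mids <- barplot(
  counts,
  col = cols,
  space = 0,
  xlab = "var.a",
  ylab = "Frequency",
  legend.text = paste("var.b =", rownames(counts))
)
```

The total height of each bar is the frequency of that var.a value, so the outline matches what `hist(test$var.a)` would show for these integer data, while the segments show how var.b is distributed inside each bar.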