manuelq manuelq - 6 months ago 38
R Question

Why does a boxplot in ggplot require both x and y axes?

I have a variable ceroonce, which is the number of schools per county (integers) in 2011. When I plot it with boxplot, it only requires the ceroonce variable. I get a boxplot in which the y axis is the number of schools and the x axis is... the "factor" ceroonce. But in ggplot2, when using geom_boxplot, it requires me to input both the x and y axes, and I just want a boxplot of ceroonce. I have tried inputting ceroonce as both the x and y axis, but then I get a weird boxplot in which the y axis is the number of schools but the x axis (which should be the factor variable) is also the number of schools. I am assuming this is very basic statistics, but I am just confused. I am attaching the images hoping this will clarify my question.

This is the code I am using:

ggplot(escuelas, aes(x=ceroonce, y=ceroonce))+geom_boxplot()


There are no fancy statistics happening here. boxplot simply assumes that, since you've given it a single vector, you want a single box in your boxplot. ggplot and geom_boxplot don't make that assumption: they expect you to say what goes on the x axis.

If you want a bit less typing, you can do this:

qplot(y=escuelas$ceroonce, x= 1, geom = "boxplot")

ggplot2 will automatically recycle the 1 into a vector of 1s equal in length to escuelas$ceroonce, so every observation lands in the same box.
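For completeness, the same idea can be written with ggplot() directly, which may be preferable since qplot() is marked as deprecated in recent ggplot2 releases. This is only a minimal sketch: the escuelas data frame below is a made-up stand-in for your data (the values are invented for illustration), and mapping x to the constant "" is just one way to put every row in a single group.

```r
library(ggplot2)

# Hypothetical stand-in for the real data: one row per county,
# ceroonce = number of schools (values invented for illustration).
escuelas <- data.frame(ceroonce = c(3, 5, 8, 2, 12, 7, 4, 9, 6, 30))

# Mapping x to a constant puts every row in the same group,
# so geom_boxplot() draws exactly one box for ceroonce.
p <- ggplot(escuelas, aes(x = "", y = ceroonce)) +
  geom_boxplot() +
  labs(x = NULL, y = "Number of schools")

print(p)
```

With aes(x = ceroonce, y = ceroonce), the x axis is the continuous variable itself rather than a single group, which is why that plot looks odd.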