user5364303 - 3 months ago
R Question

Evaluating expression inside ggplot2

I am trying to better understand ggplot2, so while I am looking for a way to accomplish the task below, I would also appreciate an explanation of why it does not currently work.
So far I could not find information on the topic.

Both of my questions are about using expressions inside ggplot2.

I have a data.frame

set.seed(1)
DF <- data.frame(A = 1:24, B = LETTERS[rep(1:4,6)], C = rep(1:3,8))

head(DF, n = 9)

# A B C
#1 1 A 1
#2 2 B 2
#3 3 C 3
#4 4 D 1
#5 5 A 2
#6 6 B 3
#7 7 C 1
#8 8 D 2
#9 9 A 3


I want to plot the mean value of the column A, grouped by the values in B without transforming my data.
I would expect that it is possible to do something like the following:

ggplot(DF) + geom_point(aes(x = B , y = mean(A), group = B))


but that returns a plot (image: "ggplot2 plots universal mean, not grouped mean") in which mean(A) is the same for all values of B.

How could I go about plotting this without transforming my data?

Another barrier I run into from time to time is trying to put an expression inside facet_grid() or facet_wrap().

For example, say I want to use the modulo operator to compute a temporary column to facet by:

DF$A %% 4
# [1] 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0


I could tack this column onto my data frame. But let's impose a restriction that I cannot transform my data.
I would have expected that I could do something like this:

ggplot(DF)+geom_point(aes(x = B, y = C)) + facet_grid({A %% 4}~.)


or

ggplot(DF)+geom_point(aes(x = B, y = C, group = A)) + facet_grid({A %% 4} ~ .)


or even

ggplot(DF)+geom_point(aes(x = B, y = C)) + facet_grid(formula({A %% 4} ~.))


but they all return the error

Error in layout_base(data, rows, drop = drop) :
At least one layer must contain all variables used for facetting


Could anyone explain to me in a way that reveals the way that ggplot2 works why these attempts fail and how I might get the desired results without transforming the data?

Answer

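Why does your plot only have one y value? Because mean(A) inside aes() is evaluated once on the whole column, before any grouping is applied. It is equivalent to mean(DF$A), which produces a single value that is then recycled for every row.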
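To make the difference concrete, here is a small base R sketch (no ggplot2 needed) that contrasts the single overall mean with the per-group means the question was after. It only reuses the DF from the question; the variable names overall_mean and group_means are illustrative.

```r
# Rebuild the data frame from the question.
DF <- data.frame(A = 1:24, B = LETTERS[rep(1:4, 6)], C = rep(1:3, 8))

# What y = mean(A) boils down to: one number for the whole column.
overall_mean <- mean(DF$A)
overall_mean
# [1] 12.5

# What the question wanted: one mean of A per level of B.
group_means <- tapply(DF$A, DF$B, mean)
group_means
#  A  B  C  D 
# 11 12 13 14
```

The plot in the question shows 12.5 at every x position, whereas a grouped summary should show 11, 12, 13 and 14.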

If you want to do a transformation, you'll have to use a stat_* function. That is exactly what they are supposed to do.

In this case:

ggplot(DF, aes(x = B, y = A, group = B)) +
  # 'fun' replaces 'fun.y', which is deprecated since ggplot2 3.3.0
  stat_summary(fun = mean, geom = 'point')

Or the equivalent:

ggplot(DF, aes(x = B, y = A, group = B)) +
  # on ggplot2 older than 3.3.0, use fun.y = mean instead
  geom_point(stat = 'summary', fun = mean)
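If you want to check that the summary stat really computes one mean per group, one option is to inspect the data ggplot2 builds for the layer with layer_data(). This is only a sketch: it assumes ggplot2 3.3.0 or later (for the fun argument), and the names p and summarised are illustrative.

```r
library(ggplot2)

# Data frame from the question, left untransformed.
DF <- data.frame(A = 1:24, B = LETTERS[rep(1:4, 6)], C = rep(1:3, 8))

# Same plot as above: the stat computes mean(A) within each group of B.
p <- ggplot(DF, aes(x = B, y = A, group = B)) +
  stat_summary(fun = mean, geom = "point")

# layer_data() returns the data frame the layer actually draws,
# i.e. the output of the stat, one row per group.
summarised <- layer_data(p)
summarised[, c("x", "group", "y")]
```

The y column should hold the four per-group means of A rather than a single repeated value.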

I don't see a way to do facetting on non-existing columns.
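That may depend on your ggplot2 version. Assuming ggplot2 3.0.0 or later, facet_grid() and facet_wrap() accept expressions quoted with vars(), which are evaluated against the layer data much like aes(). A hedged sketch of the faceting attempt from the question follows; the facet name mod4 is made up for illustration and only affects labelling.

```r
library(ggplot2)

# Data frame from the question; no column is added to it.
DF <- data.frame(A = 1:24, B = LETTERS[rep(1:4, 6)], C = rep(1:3, 8))

# vars() quotes the expression; ggplot2 evaluates A %% 4 when building
# the facet layout. 'mod4' is a hypothetical name for the facet variable.
p <- ggplot(DF, aes(x = B, y = C)) +
  geom_point() +
  facet_grid(rows = vars(mod4 = A %% 4))

p
```

This should give four row panels, one per value of A %% 4, while DF itself keeps only the columns A, B and C.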