amarillion - 1 month ago
R Question

R ggplot2: using stat_summary (mean) and logarithmic scale

I have a bunch of measurements over time and I want to plot them in R. Here is a sample of my data. I've got 6 measurements for each of 4 time points:

# One row per time point (0, 12, 24, 72), six measurements each
values <- c(1012.0, 1644.9,  837.0, 1200.9, 1652.0,  981.5,
            2236.9, 1697.5, 2087.7, 1500.8, 2789.3, 1502.9,
            2051.3, 3070.7, 3105.4, 2692.5, 1488.5, 1978.1,
            1925.4, 1524.3, 2772.0, 1355.3, 2632.4, 2600.1)
time <- factor(rep(c(0, 12, 24, 72), each = 6))


The scale of these data is arbitrary, and in fact I'm going to normalize it so that the average of t=0 is 1.

norm <- values / mean (values[time == 0])


So far so good. Using ggplot, I plot both the individual points and a line that goes through the average at each time point:

library(ggplot2)
# Note: fun.y is the older argument name; ggplot2 3.3.0 and later call it fun
p <- ggplot(data = data.frame(time, norm), mapping = aes(x = time, y = norm)) +
  stat_summary(fun.y = mean, geom = "line", mapping = aes(group = 1)) +
  geom_point()


However, now I want to apply a logarithmic scale, and this is where my trouble starts. When I do:

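# Note: scale_y_log2() was removed from later ggplot2 releases;
# scale_y_continuous(trans = "log2") is the equivalent there
q <- ggplot(data = data.frame(time, norm), mapping = aes(x = time, y = norm)) +
  stat_summary(fun.y = mean, geom = "line", mapping = aes(group = 1)) +
  geom_point() +
  scale_y_log2()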


The line does NOT go through 0 at t=0, as you would expect given that log(1) == 0. Instead, the line meets the y-axis slightly below 0. Apparently, ggplot computes the mean after the log transformation, which gives a different result. I want it to take the mean before the log transformation.
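The size of this effect can be checked numerically. The following is only a sketch in base R, using the six t=0 values from the sample data; the variable names are made up for illustration.

```r
# The six t = 0 measurements from the sample data, normalised so that
# their arithmetic mean is 1.
t0 <- c(1012.0, 1644.9, 837.0, 1200.9, 1652.0, 981.5)
t0_norm <- t0 / mean(t0)

# Mean first, then log: 0 up to floating-point rounding.
log_of_mean <- log2(mean(t0_norm))

# Log first, then mean: this is what a log-transformed scale summarises.
mean_of_logs <- mean(log2(t0_norm))

# Back-transforming the mean of the logs gives the geometric mean, which
# is never larger than the arithmetic mean, hence the line sits below 0.
geometric_mean <- 2^mean_of_logs

print(c(log_of_mean = log_of_mean,
        mean_of_logs = mean_of_logs,
        geometric_mean = geometric_mean))
```

The mean of the logs comes out slightly negative, which matches the line starting just below 0 in the plot.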

How can I tell ggplot to apply the mean first? Is there a better way to create this chart?

Answer

scale_y_log2() transforms the data first and then computes the stats and geoms on the transformed values.

coord_trans() does the opposite: it computes the stats and geoms first and then transforms the axis.

So you need coord_trans(ytrans = "log2") instead of scale_y_log2(). In later ggplot2 releases the argument is spelled y, i.e. coord_trans(y = "log2").
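Here is a minimal sketch of this approach with the question's data. It assumes a reasonably recent ggplot2 (3.3.0 or later), where the arguments are spelled fun and y rather than fun.y and ytrans; the exact names have shifted between releases, so adjust them to your version. The second plot is an alternative that is not part of the answer above: it keeps a log2 scale and undoes the transformation inside the summary function. The helper back_mean is a made-up name.

```r
library(ggplot2)

values <- c(1012.0, 1644.9,  837.0, 1200.9, 1652.0,  981.5,
            2236.9, 1697.5, 2087.7, 1500.8, 2789.3, 1502.9,
            2051.3, 3070.7, 3105.4, 2692.5, 1488.5, 1978.1,
            1925.4, 1524.3, 2772.0, 1355.3, 2632.4, 2600.1)
time <- factor(rep(c(0, 12, 24, 72), each = 6))
df <- data.frame(time, norm = values / mean(values[time == 0]))

# Approach from the answer: the mean is computed on the untransformed
# data, and only the coordinate system is log2-transformed afterwards.
p_coord <- ggplot(df, aes(x = time, y = norm)) +
  stat_summary(fun = mean, geom = "line", mapping = aes(group = 1)) +
  geom_point() +
  coord_trans(y = "log2")

# Alternative (an assumption, not from the answer): keep a log2 scale,
# but back-transform inside the summary so the mean is taken on the
# original values and then returned on the log2 scale.
back_mean <- function(y) log2(mean(2^y))
p_scale <- ggplot(df, aes(x = time, y = norm)) +
  stat_summary(fun = back_mean, geom = "line", mapping = aes(group = 1)) +
  geom_point() +
  scale_y_continuous(trans = "log2")

# Summarised value at t = 0: 1 in data units for p_coord,
# 0 in log2 units for p_scale.
print(layer_data(p_coord, 1)$y[1])
print(layer_data(p_scale, 1)$y[1])
```

One difference to be aware of: coord_trans() transforms the drawn line itself, so the segments between time points are not guaranteed to stay straight, whereas the scale-based variant draws straight segments between the summarised points.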