Hefin - 3 months ago
R Question

How to add stat_summary line to coord_polar plot in ggplot2?

I have rosette plots arranged in a facet_grid, showing histogram data from a 2x2 factorial experiment. Mock data can be generated like this:

#GENERATE MOCK DATA-------------------------------------------------------------------------
Treatment <- c(rep("Vehicle", 50), rep("Drug", 50))
Cell <- c(rep("A", 25), rep("B", 25), rep("A", 25), rep("B", 25))
Response <- c(rnorm(25, 50, 120), rnorm(25, 30, 90), rnorm(25, 50, 120), rnorm(25, 30, 90))
Data <- data.frame(Treatment, Cell, Response)


I then generate the rosette plot like this:

#PLOT ROSETTES-------------------------------------------------------------------------------
library("ggplot2")
baseplot <- ggplot(data = Data, aes(x = Response, fill = Treatment))
baseplot + geom_bar(width = 4) + coord_polar() + facet_grid(Treatment~Cell) +
labs(y = "Frequency", x = "")


Here is an image of the plot (the actual plot is much more pleasing to look at, and for the purposes of this demonstration I'm ignoring the errors about overlapping bars).

I would like to add a line to each facet, radiating from the center outwards, marking the median of each combination of factors. I have tried using stat_summary to do this, along the lines of:

+ stat_summary(fun.y = "median", geom = "line")


but I get the following errors:

Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
2: Computation failed in stat_summary(): arguments imply differing number of rows: 1, 0
3: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
4: Computation failed in stat_summary(): arguments imply differing number of rows: 1, 0
5: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
6: Computation failed in stat_summary(): arguments imply differing number of rows: 1, 0
7: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
8: Computation failed in stat_summary(): arguments imply differing number of rows: 1, 0

I know there is probably a simple solution to this, but I have always struggled to understand the syntax of stat_summary. If you can offer any help I would be grateful. I don't even mind calculating the medians manually first and adding them on.

Answer

There may be a good stat_summary answer, but I don't see one, since you would need access to the ..count.. computed by geom_bar. Note also that geom_line needs at least two points to draw a line, whereas the median gives only one value per group.

It seems easier to me to precalculate the different medians, and use geom_vline to add them to the plot. This is convenient to do with dplyr.

library(dplyr)
# One median per Cell x Treatment combination, i.e. one row per facet
Data2 <- Data %>%
  group_by(Cell, Treatment) %>%
  summarize(v = median(Response))
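
If dplyr is not available, the same per-facet medians can probably be computed in base R. This is a minimal sketch rather than part of the original answer: the name Data2_base and the set.seed() call are assumptions added for illustration, and the mock data is rebuilt so that the block runs on its own.

```r
# Mock data in the same shape as in the question
# (set.seed is an addition here, only to make the numbers reproducible)
set.seed(1)
Data <- data.frame(
  Treatment = rep(c("Vehicle", "Drug"), each = 50),
  Cell = rep(rep(c("A", "B"), each = 25), times = 2),
  Response = c(rnorm(25, 50, 120), rnorm(25, 30, 90),
               rnorm(25, 50, 120), rnorm(25, 30, 90))
)

# One median per Cell x Treatment combination
# (Data2_base is a hypothetical name, chosen to avoid clashing with Data2)
Data2_base <- aggregate(Response ~ Cell + Treatment, data = Data, FUN = median)

# Rename the summary column to v so it matches the dplyr version
names(Data2_base)[names(Data2_base) == "Response"] <- "v"

print(Data2_base)
```

Data2_base should then work in place of Data2 in the geom_vline call, since it carries the same Cell, Treatment and v columns.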

Making the plot:

library(ggplot2)
baseplot <- ggplot(data = Data, aes(x = Response, fill = Treatment))
baseplot + geom_bar(width = 4) + coord_polar() + facet_grid(Treatment~Cell) +
  labs(y = "Frequency", x = "") + 
  geom_vline(data = Data2, aes(xintercept = v), size = 1.5)
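
To check that each facet really receives its own median line, the built plot can be inspected with layer_data(). The following is a sketch under stated assumptions: the medians are computed with aggregate() as a base R stand-in for the dplyr summary, the seed is arbitrary, the line width is left at its default, and the object names p and vlines are illustrative only.

```r
library(ggplot2)

# Mock data as in the question (seed added here for reproducibility)
set.seed(1)
Data <- data.frame(
  Treatment = rep(c("Vehicle", "Drug"), each = 50),
  Cell = rep(rep(c("A", "B"), each = 25), times = 2),
  Response = c(rnorm(25, 50, 120), rnorm(25, 30, 90),
               rnorm(25, 50, 120), rnorm(25, 30, 90))
)

# Base R stand-in for the dplyr summary: one median per facet
Data2 <- aggregate(Response ~ Cell + Treatment, data = Data, FUN = median)
names(Data2)[names(Data2) == "Response"] <- "v"

# Same plot as in the answer; p is a hypothetical name for the finished plot
p <- ggplot(data = Data, aes(x = Response, fill = Treatment)) +
  geom_bar(width = 4) +
  coord_polar() +
  facet_grid(Treatment ~ Cell) +
  labs(y = "Frequency", x = "") +
  geom_vline(data = Data2, aes(xintercept = v))

# geom_vline is the second layer, so i = 2 extracts its computed data.
# Overlapping-bar warnings from geom_bar may appear here, as in the question.
vlines <- layer_data(p, 2)
print(vlines[, c("PANEL", "xintercept")])
```

With the default coord_polar(theta = "x"), x is mapped to the angle, so a vertical line at xintercept = v is drawn as a spoke from the center outwards. The printed table should show one xintercept per PANEL, each equal to that facet's median.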
