Hefin - 7 months ago 84

R Question

I have rosette plots arranged in a facet_grid, showing histogram data from a 2x2 factorial experiment. Mock data as generated here:

```
# GENERATE MOCK DATA
Treatment <- c(rep("Vehicle", 50), rep("Drug", 50))
Cell <- c(rep("A", 25), rep("B", 25), rep("A", 25), rep("B", 25))
Response <- c(rnorm(25, 50, 120), rnorm(25, 30, 90), rnorm(25, 50, 120), rnorm(25, 30, 90))
Data <- data.frame(Treatment, Cell, Response)
```

I then generate the rosette plot like this:

```
# PLOT ROSETTES
library("ggplot2")
baseplot <- ggplot(data = Data, aes(x = Response, fill = Treatment))
baseplot + geom_bar(width = 4) + coord_polar() + facet_grid(Treatment~Cell) +
  labs(y = "Frequency", x = "")
```

Here is an image of the plot (the actual plot is much more pleasing to look at, and for the purposes of this demonstration I'm ignoring the errors about overlapping bars).

I would like to add a line to each facet, radiating from the center outwards, marking the median of each combination of factors. I have tried using stat_summary to do this, along the lines of:

`+ stat_summary(fun.y = "median", geom = "line")`

but I get the following errors:

```
Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
2: Computation failed in `stat_summary()`:
arguments imply differing number of rows: 1, 0
3: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
4: Computation failed in `stat_summary()`:
arguments imply differing number of rows: 1, 0
5: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
6: Computation failed in `stat_summary()`:
arguments imply differing number of rows: 1, 0
7: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
8: Computation failed in `stat_summary()`:
arguments imply differing number of rows: 1, 0
```

I know there is probably a simple solution to this, but I have always struggled to understand the syntax of stat_summary. If you can offer any help I would be grateful. I don't even mind calculating the medians manually first and adding them on.

Answer

There might be a good `stat_summary` answer possible, but I'm not seeing it, as you would need access to the `..count..` produced by `geom_bar`. Also note that `geom_line` would need multiple points to draw a line, and `median` would only give one value anyway.

It seems easier to me to precalculate the different medians and use `geom_vline` to add them to the plot. This is convenient to do with `dplyr`.

```
library(dplyr)
# One median per Cell x Treatment combination, i.e. one per facet
Data2 <- Data %>%
  group_by(Cell, Treatment) %>%
  summarize(v = median(Response))
```
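If you would rather not load `dplyr`, the same per-facet medians can be computed in base R with `aggregate`. This is only a minimal sketch: the `set.seed(1)` call and the name `Data2_base` are my own additions for illustration, not part of the question, and the mock data is regenerated so the block runs on its own.

```r
# Base-R sketch: per-facet medians without dplyr.
set.seed(1)  # assumption: fixed seed so the mock data is reproducible

Treatment <- c(rep("Vehicle", 50), rep("Drug", 50))
Cell <- c(rep("A", 25), rep("B", 25), rep("A", 25), rep("B", 25))
Response <- c(rnorm(25, 50, 120), rnorm(25, 30, 90),
              rnorm(25, 50, 120), rnorm(25, 30, 90))
Data <- data.frame(Treatment, Cell, Response)

# One row per Cell x Treatment combination, holding the median Response.
Data2_base <- aggregate(Response ~ Cell + Treatment, data = Data, FUN = median)

# Rename the summary column to `v`, matching the dplyr version.
names(Data2_base)[names(Data2_base) == "Response"] <- "v"

print(Data2_base)
```

`Data2_base` should be usable in `geom_vline(data = ..., aes(xintercept = v))` in the same way as `Data2`, since it carries the same `Cell`, `Treatment` and `v` columns.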

Making the plot:

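```
library(ggplot2)
baseplot <- ggplot(data = Data, aes(x = Response, fill = Treatment))
baseplot + geom_bar(width = 4) + coord_polar() + facet_grid(Treatment~Cell) +
  labs(y = "Frequency", x = "") +
  # Data2 has Cell and Treatment columns, so each facet gets its own median line.
  # In ggplot2 >= 3.4.0, `size` for lines is deprecated in favour of `linewidth`.
  geom_vline(data = Data2, aes(xintercept = v), size = 1.5)
```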

Result: