user3022875 - 1 month ago
R Question

How to use geom_errorbar() with long-format data

I have a data frame "dat" in long format. The plot below draws time series a and b as a line graph, with their upper and lower bounds drawn as extra lines, but I would like to draw the bounds with geom_errorbar() instead. How can the plot be changed to use geom_errorbar() when dat is in this format?

The issue is that geom_errorbar() takes ymin and ymax aesthetics:

geom_errorbar(aes(ymax = Upper ,ymin = Lower), width = .25)


but when the data is in long format, the ymin and ymax values are spread across four series: upper_a, lower_a, upper_b and lower_b.

library(ggplot2)

dat = data.frame(x = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
                 y = c(1, 2, 5, 6, 2, 3, 0, 1, 6, 7, 4, 5),
                 group = c("a", "a", "b", "b", "upper_a", "upper_a", "lower_a", "lower_a",
                           "upper_b", "upper_b", "lower_b", "lower_b"))

ggplot(data = dat, aes(x = as.factor(x), y = y, fill = group, group = group,
                       color = group)) +
    geom_line() + geom_point() +
    scale_fill_manual(name = "Metric",
        labels = c(
            a = "a",
            b = "b",
            upper_a = "upper a",
            lower_a = "lower a",
            upper_b = "upper b",
            lower_b = "lower b"),
        values = c(
            a = "red",
            b = "blue",
            upper_a = "lightpink",
            lower_a = "lightpink",
            upper_b = "lightsteelblue",
            lower_b = "lightsteelblue")
    ) +
    scale_color_manual(name = "Metric",
        labels = c(
            a = "a",
            b = "b",
            upper_a = "upper a",
            lower_a = "lower a",
            upper_b = "upper b",
            lower_b = "lower b"),
        values = c(
            a = "red",
            b = "blue",
            upper_a = "lightpink",
            lower_a = "lightpink",
            upper_b = "lightsteelblue",
            lower_b = "lightsteelblue")
    )

Answer

Generally, ggplot wants data to be in "long" format. However, to plot things like error terms, you usually need to bend this rule, and have separate "wide" columns for your main value and error value(s). So first, we can transform your existing data:

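library(dplyr)
library(tidyr)
library(ggplot2)

# Split e.g. "upper_a" into measure = "upper", group = "a"; plain "a"/"b" get group = NA
dat2 <- separate(dat, group, c('measure', 'group'), fill = 'right') %>% 
    # For plain "a"/"b" rows, move the name back into group and label the measure "value"
    mutate(group = ifelse(is.na(group), measure, group), measure = ifelse(measure %in% c('a', 'b'), 'value', measure)) %>% 
    # One row per x/group pair, with lower, upper and value as separate columns
    spread(measure, y)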

  x group lower upper value
1 1     a     0     2     1
2 1     b     4     6     5
3 2     a     1     3     2
4 2     b     5     7     6
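
As a side note, spread() has since been superseded in tidyr by pivot_wider(). A minimal sketch of the same reshaping with pivot_wider(), assuming tidyr 1.0.0 or later, could look like the following. The name dat2_wide and the sub()/grepl() string handling are choices made for this illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
dat <- data.frame(
  x = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
  y = c(1, 2, 5, 6, 2, 3, 0, 1, 6, 7, 4, 5),
  group = c("a", "a", "b", "b", "upper_a", "upper_a", "lower_a", "lower_a",
            "upper_b", "upper_b", "lower_b", "lower_b"),
  stringsAsFactors = FALSE
)

dat2_wide <- dat %>%
  mutate(
    # "upper_a" -> "upper", "lower_b" -> "lower"; plain "a"/"b" -> "value"
    measure = ifelse(grepl("_", group), sub("_.*$", "", group), "value"),
    # "upper_a" -> "a"; plain "a" stays "a" (measure above still saw the old group)
    group = sub("^.*_", "", group)
  ) %>%
  # One column per measure, one row per x/group pair
  pivot_wider(names_from = measure, values_from = y) %>%
  arrange(x, group)
```

The result has the same lower, upper and value columns as dat2, so the plotting code works unchanged if dat2_wide is passed in its place.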

And then plot the data using these new columns:

plot.new <- ggplot(data = dat2, aes(x = x, y = value, ymin = lower, ymax = upper, color = group)) +
    geom_point() +
    geom_line() +
    geom_errorbar(width = 1/5)
print(plot.new)
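
If you want to keep the discrete x axis and the red/blue colours from the question, a variant along these lines should work. This is a sketch, not part of the original answer: the name plot.discrete is made up, and dat2 is typed in by hand here only so the snippet runs on its own. The important detail is group = group: once x is a factor, ggplot2 groups by every discrete variable including x, so without an explicit group the lines would not connect.

```r
library(ggplot2)

# dat2 as produced by the reshaping step, written out by hand for this sketch
dat2 <- data.frame(
  x = c(1, 1, 2, 2),
  group = c("a", "b", "a", "b"),
  lower = c(0, 4, 1, 5),
  upper = c(2, 6, 3, 7),
  value = c(1, 5, 2, 6)
)

plot.discrete <- ggplot(data = dat2,
                        aes(x = factor(x), y = value, ymin = lower, ymax = upper,
                            color = group, group = group)) +  # explicit group keeps lines connected
    geom_point() +
    geom_line() +
    geom_errorbar(width = 1/5) +
    # Only the two real series remain, so two colours are enough
    scale_color_manual(name = "Metric", values = c(a = "red", b = "blue")) +
    labs(x = "x")
```

Call print(plot.discrete) to draw it.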


Incidentally, if the error is symmetric, you can re-use a single error column (assumed here to be named error; dat2 above does not have one) by using mathematical expressions in aes():

plot.new <- ggplot(data = dat2, aes(x = x, y = value, ymin = value - error, ymax = value + error, color = group)) + ...
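
For a complete, runnable version of the symmetric case, here is a small sketch. The data frame dat3, its error column and the name plot.sym are all hypothetical. In the question's data the bounds happen to sit exactly 1 above and 1 below each value, so an error of 1 reproduces the same bars.

```r
library(ggplot2)

# Hypothetical data: one value per x/group pair plus a single symmetric error
dat3 <- data.frame(
  x = c(1, 1, 2, 2),
  group = c("a", "b", "a", "b"),
  value = c(1, 5, 2, 6),
  error = c(1, 1, 1, 1)  # assumed half-width of each error bar
)

# ymin and ymax are computed inside aes() from the single error column
plot.sym <- ggplot(data = dat3,
                   aes(x = x, y = value,
                       ymin = value - error, ymax = value + error,
                       color = group)) +
    geom_point() +
    geom_line() +
    geom_errorbar(width = 1/5)
```

As before, print(plot.sym) draws the plot.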