user3552144 - 3 months ago
R Question

Having trouble plotting multiple data sets and their confidence intervals on the same ggplot (data frame included)

First off, here is my data frame:

> df.combined
MLSupr MLSpred MLSlwr BPLupr BPLpred BPLlwr
1 1.681572 1.392213 1.102854 1.046068 0.8326201 0.6191719
2 3.363144 2.784426 2.205708 2.112885 1.6988250 1.2847654
3 5.146645 4.232796 3.318946 3.201504 2.5999694 1.9984346
4 6.930146 5.681165 4.432184 4.368555 3.6146180 2.8606811
5 8.713648 7.129535 5.545422 5.480557 4.5521112 3.6236659
6 10.497149 8.577904 6.658660 6.592558 5.4896044 4.3866506
7 12.280651 10.026274 7.771898 7.681178 6.3907488 5.1003198
8 14.064152 11.474644 8.885136 8.924067 7.4889026 6.0537381
9 15.847653 12.923013 9.998373 10.125539 8.5444783 6.9634176
10 17.740388 14.429805 11.119222 11.327011 9.6000541 7.8730970
11 19.633122 15.936596 12.240071 12.620001 10.7425033 8.8650055
12 21.525857 17.443388 13.360919 13.821473 11.7980790 9.7746850
13 23.535127 19.010958 14.486789 15.064362 12.8962328 10.7281032
14 25.544397 20.578528 15.612659 16.307252 13.9943865 11.6815215
15 27.553667 22.146098 16.738529 17.600241 15.1368357 12.6734300
16 29.562937 23.713668 17.864399 18.893231 16.2792849 13.6653384
17 31.572207 25.281238 18.990268 20.245938 17.4678163 14.6896948
18 33.581477 26.848807 20.116138 21.538928 18.6102655 15.6816033
19 35.590747 28.416377 21.242008 22.891634 19.7987969 16.7059597
20 37.723961 30.047177 22.370394 24.313671 21.0352693 17.7568676


So, as you can see, I have predicted values along with the upper and lower bounds of their 95% CI. I'd like to plot the lines and their ribbons for MLS and BPL in the same plot, but I'm not quite sure how.
Right now, for a single data set, I am using this command:

ggplot(BULISeason, aes(x = 1:length(BULISeason$`Running fit`), y = `Running fit`)) +
geom_line(aes(fill = "black")) +
geom_ribbon(aes(ymin = `Running lwr`, ymax = `Running upr`, fill = "red"),alpha = 0.25)


Note: The variables are different for the independent data frames.

Answer

You can, of course, build the plot as a series of layers, as you imply in your question. For that you can use the following code:

library(ggplot2)

# df.combined has no x column, so add the row index to plot against
df.combined$x <- seq_len(nrow(df.combined))

ggplot(data = df.combined, aes(x = x)) +
  geom_ribbon(aes(ymin = MLSlwr, ymax = MLSupr),
              fill = "blue", alpha = 0.25) +
  geom_line(aes(y = MLSpred), color = "black") +
  geom_ribbon(aes(ymin = BPLlwr, ymax = BPLupr),
              fill = "red", alpha = 0.25) +
  geom_line(aes(y = BPLpred), color = "black")

and obtain something like this: plot1
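
A side note on the layered version: colours set outside aes() do not produce a legend. If you want one, a possible approach (a sketch, not part of the code above) is to map a label inside aes() and assign the colours with scale_fill_manual() and scale_color_manual(). The data frame demo and its values are made up for illustration; only the column layout follows df.combined.

```r
library(ggplot2)

# Made-up data in the same wide layout as df.combined (illustrative values only)
demo <- data.frame(
  x       = 1:5,
  MLSlwr  = c(1.1, 2.2, 3.3, 4.4, 5.5),
  MLSpred = c(1.4, 2.8, 4.2, 5.7, 7.1),
  MLSupr  = c(1.7, 3.4, 5.1, 6.9, 8.7),
  BPLlwr  = c(0.6, 1.3, 2.0, 2.9, 3.6),
  BPLpred = c(0.8, 1.7, 2.6, 3.6, 4.6),
  BPLupr  = c(1.0, 2.1, 3.2, 4.4, 5.5)
)

# Colours assigned to the labels used inside aes()
model.colours <- c(MLS = "blue", BPL = "red")

# Mapping a label (not a colour) inside aes() makes ggplot2 build a legend;
# the manual scales then translate each label into its colour
p.legend <- ggplot(demo, aes(x = x)) +
  geom_ribbon(aes(ymin = MLSlwr, ymax = MLSupr, fill = "MLS"), alpha = 0.25) +
  geom_ribbon(aes(ymin = BPLlwr, ymax = BPLupr, fill = "BPL"), alpha = 0.25) +
  geom_line(aes(y = MLSpred, color = "MLS")) +
  geom_line(aes(y = BPLpred, color = "BPL")) +
  scale_fill_manual(name = "model", values = model.colours) +
  scale_color_manual(name = "model", values = model.colours)

p.legend
```

Because both scales share the same name and labels, ggplot2 should merge them into a single legend.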

However, reshaping your dataset to a "tidy", or long, format has some advantages. For example, you could map the origin of the predictions (the model) to color and the type of prediction to line type in the resulting plot:

plot2

You can achieve that using the following code:

library(dplyr)    # mutate() and %>%
library(tidyr)    # gather(), separate(), spread()
library(ggplot2)

tidy.data  <- df.combined %>% 
  # add id variable
  mutate(x = 1:20) %>% 
  # reshape to long format
  gather("variable", "value", 1:6) %>% 
  # separate variable names at position 3
  separate(variable, 
           into = c("model", "line"), 
           sep = 3, 
           remove = TRUE)

# plot
ggplot(data = tidy.data, aes(x        = x, 
                             y        = value, 
                             linetype = line, 
                             color    = model)) + 
  geom_line() + 
  scale_linetype_manual(values = c("dashed", "solid", "dashed"))
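
If you have tidyr 1.0.0 or later, gather() and separate() can probably be replaced by a single pivot_longer() call, since its names_sep argument accepts a split position the same way separate() does. A minimal sketch, assuming the same column naming scheme; demo and long.demo are made-up names and the values are illustrative:

```r
library(tidyr)

# Made-up data with the same column naming scheme as df.combined
demo <- data.frame(
  MLSupr = c(1.7, 3.4), MLSpred = c(1.4, 2.8), MLSlwr = c(1.1, 2.2),
  BPLupr = c(1.0, 2.1), BPLpred = c(0.8, 1.7), BPLlwr = c(0.6, 1.3)
)
demo$x <- seq_len(nrow(demo))  # id variable, as in the answer

# Reshape to long format and split each column name after its
# third character ("MLSupr" becomes model "MLS" and line "upr")
long.demo <- pivot_longer(
  demo,
  cols      = -x,
  names_to  = c("model", "line"),
  names_sep = 3,
  values_to = "value"
)

long.demo
```

The result has one row per combination of x, model and line, so it can be passed to the plotting code in place of tidy.data.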

You can still use ribbons in your plot by spreading your data frame back to a wide(r) format, since geom_ribbon() needs the lower and upper bounds in separate columns:

# back to wide
wide.data <- tidy.data %>% 
  spread(line, value)

# plot with ribbon
ggplot(data = wide.data, aes(x = x, y = pred)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr, fill = model), alpha = .5) +
  geom_line(aes(group = model))

plot3
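
If the ribbon plot is all you need, the round trip through the long format can likely be skipped. With tidyr 1.0.0 or later, pivot_longer() accepts the special name ".value" in names_to, which keeps the second part of each column name (upr, pred, lwr) as separate columns. This is a sketch under that assumption; demo, ribbon.data and p.ribbon are made-up names and the values are illustrative:

```r
library(tidyr)
library(ggplot2)

# Made-up data with the same column naming scheme as df.combined
demo <- data.frame(
  MLSupr  = c(1.7, 3.4, 5.1, 6.9, 8.7),
  MLSpred = c(1.4, 2.8, 4.2, 5.7, 7.1),
  MLSlwr  = c(1.1, 2.2, 3.3, 4.4, 5.5),
  BPLupr  = c(1.0, 2.1, 3.2, 4.4, 5.5),
  BPLpred = c(0.8, 1.7, 2.6, 3.6, 4.6),
  BPLlwr  = c(0.6, 1.3, 2.0, 2.9, 3.6)
)
demo$x <- seq_len(nrow(demo))

# Split names after the third character: the first part goes into
# "model", the second part (".value") becomes the column names
ribbon.data <- pivot_longer(
  demo,
  cols      = -x,
  names_to  = c("model", ".value"),
  names_sep = 3
)

# One ribbon and one line per model
p.ribbon <- ggplot(ribbon.data, aes(x = x, y = pred)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr, fill = model), alpha = 0.5) +
  geom_line(aes(color = model))

p.ribbon
```

Here ribbon.data has one row per x and model, with lwr, pred and upr as columns, which is the shape geom_ribbon() expects.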

Hope this helps!
