giacomoV - 22 days ago 15
R Question

R - ggplot geom_smooth facet_grid CI not showing

I am having a hard time understanding why the confidence intervals are not showing with my data. When I run the same code on another dataset, such as mtcars, it works fine.

The code is

mtols = mtcars %>% group_by(am) %>% do(lm0 = lm(disp ~ mpg*gear + vs, data=.)) %>%
augment(., lm0) %>%
mutate(ymin=.fitted-1.96*.se.fit, ymax=.fitted+1.96*.se.fit)


To generate the plot

mtols %>% ggplot(aes(mpg, .fitted) ) +
geom_smooth(data = mtols, aes(mpg, .fitted, group = gear, colour = gear, fill= gear), method="lm") +
theme_minimal() + facet_grid(~am)


I get the confidence intervals.

However, this doesn't work with my data. Could someone help me figure out what is going wrong? I would be very grateful.

I compute the OLS fit with

dt = new %>% group_by(day) %>% do(lm0 = lm(y ~ year*class, data=.)) %>% augment(., lm0) %>%
mutate(ymin=.fitted-1.96*.se.fit, ymax=.fitted+1.96*.se.fit)

dt$year = as.numeric(as.character(dt$year))


The plot (this example uses only a few cases, but the result is the same with the whole dataset):

dt %>% ggplot(aes(year, .fitted) ) +
geom_smooth(data = dt, aes(year, .fitted, group = class, colour = class, fill= class), method="lm") +
theme_bw() + facet_grid(~day)


The CIs are not showing.


Any clue what I am doing wrong here?

Strangely, when I don't use facet_grid here, the CIs work perfectly:


dt %>% ggplot(aes(year, .fitted) ) +
geom_smooth(data = dt, aes(year, .fitted, group = class, colour = class, fill= class), method="lm") +
theme_bw()


A sample of my data

library(broom)
library(dplyr)
library(ggplot2)

new = structure(list(id = structure(c(844084L, 114510L, 14070410L,
942483L, 13190105L, 421369L, 301384L, 251789L, 11011210L, 11280408L,
278575L, 310410L, 16260105L, 11110815L, 18260101L, 14260501L,
10580L, 15090210L, 19140410L, 13230615L, 246511L, 20040812L,
14260114L, 287623L, 16090620L, 20131007L, 835743L, 453390L, 395808L,
363617L), label = "Household identifier", class = c("labelled",
"integer")), year = structure(c(1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L,
2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L,
2L, 2L, 1L, 1L, 1L, 1L), .Label = c("2000", "2015"), class = "factor"),
day = c("Weekend", "Weekend", "Weekend", "Weekdays", "Weekdays",
"Weekend", "Weekdays", "Weekend", "Weekend", "Weekdays",
"Weekend", "Weekdays", "Weekdays", "Weekend", "Weekend",
"Weekdays", "Weekdays", "Weekend", "Weekdays", "Weekdays",
"Weekdays", "Weekend", "Weekend", "Weekend", "Weekend", "Weekend",
"Weekend", "Weekdays", "Weekdays", "Weekdays"), class = structure(c(1L,
1L, 2L, 2L, 1L, 2L, 2L, 4L, 2L, 2L, 3L, 2L, 1L, 4L, 1L, 3L,
2L, 3L, 2L, 4L, 2L, 1L, 3L, 2L, 1L, 4L, 3L, 2L, 4L, 1L), .Label = c("Higher Managerial",
"Lower Managerial", "Intermediate", "Manual and Routine"), class = "factor"),
y = c(270, 730, 180, 0, 0, 290, 90, 650, 510, 0, 10, 200,
200, 180, 0, 0, 140, 260, 110, 740, 260, 0, 390, 610, 0,
0, 500, 0, 10, 170)), class = "data.frame", row.names = c(NA,
-30L), .Names = c("id", "year", "day", "class", "y"))

Answer

The confidence intervals are being drawn. We can't see them because, within each day, every class has only two unique points (one fitted value per year), and a line through two points fits them exactly.

dt2 <- dt %>% filter(class == "Higher Managerial")
plot(.fitted ~ year, data=subset(dt2, day=="Weekend"))


We see intervals without the facet because each class then has four unique points (two years for each of the two days), and a single line cannot pass through all of them.


When we do not break out by facet, the points scatter around the fitted line, so the confidence interval has some width. A line fitted to two unique points leaves zero residuals, so its confidence interval has zero width.

confint(lm(.fitted ~ year, data=subset(dt2, day=="Weekdays")))
#                     2.5 %      97.5 %
#   (Intercept) 9503.333333 9503.333333
# year            -4.666667   -4.666667
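
The same effect can be reproduced without the original data. The following is only a minimal sketch in base R: the data frames one_facet and pooled, the column name fitted, the helper ci_width and all the numbers are hypothetical, chosen to mimic "two unique points per facet" versus "four unique points when the facets are pooled".

```r
# Hypothetical numbers, chosen only to mimic the shape of the problem.
# One facet: each year has a single, repeated fitted value.
one_facet <- data.frame(
  year   = rep(c(2000, 2015), each = 3),
  fitted = rep(c(170, 100), each = 3)
)

# Both facets pooled: the second day has different fitted values.
pooled <- rbind(
  one_facet,
  data.frame(year   = rep(c(2000, 2015), each = 3),
             fitted = rep(c(300, 250), each = 3))
)

# Width of the 95% interval for each coefficient.
# suppressWarnings() hides R's "essentially perfect fit" warning.
ci_width <- function(d) {
  ci <- suppressWarnings(confint(lm(fitted ~ year, data = d)))
  ci[, 2] - ci[, 1]
}

width_one    <- ci_width(one_facet)  # effectively zero
width_pooled <- ci_width(pooled)     # clearly positive
```

With a single facet the line passes through both points, so the interval collapses. Once the two days are pooled, the points no longer lie on one line and the interval opens up.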

Edit

Here is a version that uses the ymin and ymax originally calculated from the model's standard errors and plots them with geom_ribbon.

dt %>% ggplot(aes(year, .fitted,group = class, colour = class, fill= class)) + 
  geom_line() +
  geom_ribbon(aes(ymin=ymin, ymax=ymax), alpha=0.2) + 
  theme_bw() + facet_grid(~day) 
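
For reference, the ymin and ymax columns can also be computed without broom. This is a rough sketch in base R, assuming the same 1.96 multiplier as in the question; the data frame toy and its values are hypothetical stand-ins for a single day of the real data.

```r
# Hypothetical toy data standing in for one day of the real data set.
toy <- data.frame(
  year  = rep(c(2000, 2015), each = 4),
  class = factor(rep(c("A", "B"), times = 4)),
  y     = c(270, 180, 290, 90, 0, 110, 200, 260)
)

# Same model form as in the question, fitted to the raw response.
fit <- lm(y ~ year * class, data = toy)

# predict() with se.fit = TRUE returns the fitted values ($fit)
# and the standard error of each fitted value ($se.fit).
pred <- predict(fit, se.fit = TRUE)

toy$.fitted <- pred$fit
toy$ymin    <- pred$fit - 1.96 * pred$se.fit
toy$ymax    <- pred$fit + 1.96 * pred$se.fit
```

These bounds describe the uncertainty of the model fitted to y, which is why they have visible width, whereas geom_smooth refits a line to the .fitted values themselves. The resulting columns can be passed to geom_ribbon in the same way as in the plot above.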
