DeltaIV - 7 months ago 51

R Question

I was unsure whether this question would be more appropriate here or on Cross Validated. I hope I made the right choice.

Consider the example:

```
library(dplyr)

setosa <- iris %>% filter(Species == "setosa") %>% select(Sepal.Length, Sepal.Width, Species)

library(ggplot2)

ggplot(data = setosa, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2))
```

By default,

`ggplot`

`ggplot2`

`predict`

Answer

One way to check what `predict.lm()` computes is to inspect the code (`predict` multiplies standard errors by `qt((1 - level)/2, df)`, and so does not appear to make adjustments for simultaneous inference). Another way is to construct simultaneous confidence intervals and compare them against `predict`'s intervals.
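As a quick check of the first approach, the pointwise interval can be rebuilt by hand from the standard errors and a plain t quantile, then compared with what `predict` returns. This is only a sketch using base R; the names `level`, `tcrit`, `manual` and `auto` are made up for illustration.

```r
# Quadratic fit on the setosa subset, as in the question
setosa <- subset(iris, Species == "setosa")
fit <- lm(Sepal.Width ~ poly(Sepal.Length, 2), data = setosa)

level <- 0.95
p <- predict(fit, se.fit = TRUE)              # fitted values, standard errors, residual df
tcrit <- qt(1 - (1 - level) / 2, df = p$df)   # ordinary two-sided t quantile, no multiplicity adjustment

# Hand-built pointwise interval: fit +/- tcrit * se
manual <- cbind(fit = p$fit,
                lwr = p$fit - tcrit * p$se.fit,
                upr = p$fit + tcrit * p$se.fit)

# What predict() reports when asked for a confidence interval
auto <- predict(fit, interval = "confidence", level = level)

all.equal(manual, auto, check.attributes = FALSE)
```

If the two agree, the interval from `predict` is the usual pointwise one, since nothing in the hand-built version accounts for the number of points at which the band is evaluated.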

Fit the model and construct simultaneous confidence intervals:

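```
# Keep only setosa and sort by Sepal.Length so the bands plot as ordered lines
setosa <- subset(iris, Species == "setosa")
setosa <- setosa[order(setosa$Sepal.Length), ]
fit <- lm(Sepal.Width ~ poly(Sepal.Length, 2), setosa)
# One row per observation: intercept plus the same polynomial basis used in the fit
K <- cbind(1, poly(setosa$Sepal.Length, 2))
# Simultaneous inference on all fitted values K %*% coef(fit)
cht <- multcomp::glht(fit, linfct = K)
cci <- confint(cht)
```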

Reshape and plot:

```
# Columns of cci$confint are Estimate, lwr, upr; keep the two bounds plus x
cc <- as.data.frame(cci$confint)
cc$Sepal.Length <- setosa$Sepal.Length
cc <- reshape2::melt(cc[, 2:4], id.vars = "Sepal.Length")
library(ggplot2)
# Shaded band: geom_smooth's interval; red lines: simultaneous bounds
ggplot(data = setosa, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2)) +
  geom_line(data = cc,
            aes(x = Sepal.Length, y = value, group = variable),
            colour = "red")
```

It appears that `predict(..., interval = "confidence")` does not produce simultaneous confidence intervals: the red simultaneous bounds lie outside the shaded band drawn by `geom_smooth`.
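The same comparison can be made numerically rather than by eye. The sketch below rebuilds the objects from the answer so that it runs on its own, then compares the width of the simultaneous intervals with the width of the intervals from `predict`. It assumes the `multcomp` package is installed; the names `simultaneous`, `pw` and `pointwise` are illustrative only.

```r
# Rebuild the fit and contrast matrix from the answer
setosa <- subset(iris, Species == "setosa")
setosa <- setosa[order(setosa$Sepal.Length), ]
fit <- lm(Sepal.Width ~ poly(Sepal.Length, 2), setosa)
K <- cbind(1, poly(setosa$Sepal.Length, 2))

# Assumption: the simultaneous critical value is computed with a randomised
# numerical algorithm, so the seed is fixed for reproducibility
set.seed(1)
cci <- confint(multcomp::glht(fit, linfct = K))

# Interval widths at each observed Sepal.Length
simultaneous <- cci$confint[, "upr"] - cci$confint[, "lwr"]
pw <- predict(fit, interval = "confidence")
pointwise <- pw[, "upr"] - pw[, "lwr"]

range(simultaneous / pointwise)   # ratio of the two widths across observations
all(simultaneous > pointwise)     # are the simultaneous intervals wider everywhere?
```

Both sets of intervals are centred on the same fitted values and use the same standard errors, so the width ratio reflects only the difference between the two critical values.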