tosik - 1 year ago 55

R Question

I have a very basic question about understanding the linear regression model. Consider the simple case $y = a + bx + e$, where $e$ is the error term. I use OLS to estimate the coefficients $a$ and $b$, so the fitted values are $\hat y = \hat a + \hat b x$. Shouldn't they all lie on the same line, since the relationship is linear? I ask because I ran a simple experiment in R and got a counterintuitive result:

```
x <- rnorm(20, 3, 1)
y <- 12 + 4*x + rnorm(20, 0, 0.5)
m <- lm(y ~ x)
a <- coef(m)[1]
b <- coef(m)[2]
plot(x, y) #plot initial data
abline(a = a, b = b, lwd = 2, col = 2) #plot fitted line
points(x = m$fitted.values, col = 4, pch = 4) #plot fitted values
legend('topleft', c("Actual", "Fitted line", "Fitted values"), col = c(1, 2, 4), pch = c(1, 1, 4), lty = c(0, 1, 0))
```

Why do the fitted values not lie on the fitted line?

Answer Source

Replace the `points()` call with

```
points(x = x, y = m$fitted.values, col = 4, pch = 4) #plot fitted values
```

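The fitted values are values of $y$, not of $x$. When `points()` is given a single vector, it plots that vector on the vertical axis against its index (1, 2, ..., 20), so the original call drew each fitted value at the wrong horizontal position. Paired with the corresponding `x`, the fitted values fall exactly on the fitted line.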
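To check this numerically rather than by eye, here is a minimal sketch. The seed and the names `manual_fit`, `fits_match`, `xy` and `index_used` are my own additions for illustration, not part of the question's code. It recomputes the fitted values from the estimated coefficients, then inspects the coordinates that `xy.coords()` (the helper the base graphics functions rely on) produces from a single vector.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
x <- rnorm(20, 3, 1)
y <- 12 + 4 * x + rnorm(20, 0, 0.5)
m <- lm(y ~ x)

# Fitted values recomputed by hand from the estimated coefficients
manual_fit <- coef(m)[1] + coef(m)[2] * x

# Compare with what lm() stored (names dropped so only values are compared)
fits_match <- isTRUE(all.equal(unname(fitted(m)), unname(manual_fit)))
print(fits_match)

# Coordinates used when only one vector is supplied:
# the vector goes to y and its index 1..n goes to x
xy <- xy.coords(fitted(m))
index_used <- isTRUE(all.equal(xy$x, seq_along(x)))
print(index_used)
```

If both checks print `TRUE`, the fitted values are exactly $\hat a + \hat b x$, and the original plot put them at horizontal positions 1 to 20 instead of at `x`.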