Nudibranch - 1 year ago 96

R Question

I'm trying to learn QR decomposition, but can't figure out how to get the variance of beta_hat without resorting to traditional matrix calculations. I'm practising with the

`iris`

`y<-(iris$Sepal.Length)`

x<-(iris$Sepal.Width)

X<-cbind(1,x)

n<-nrow(X)

p<-ncol(X)

qr.X<-qr(X)

b<-(t(qr.Q(qr.X)) %*% y)[1:p]

R<-qr.R(qr.X)

beta<-as.vector(backsolve(R,b))

res<-as.vector(y-X %*% beta)

Thanks for your help!

Answer Source

Residual degree of freedom is `n - p`

*(More accurately, n - qr.X$rank. In the following, I will use qr.X$rank instead of p.)*, so estimated variance is

```
se2 <- sum(res ^ 2) / (n - qr.X$rank)
```

Thus, the variance covariance matrix of your estimated `beta_hat`

is just

```
chol2inv(R) * se2
# [,1] [,2]
#[1,] 0.22934170 -0.07352916
#[2,] -0.07352916 0.02405009
```

Very often, we only want marginal variance, i.e., the diagonals of this full variance-covariance matrix. In that case, there is no need to call `chol2inv`

to first explicitly form the full covariance matrix. We can compute diagonals directly at much lower computational costs:

```
Rinv <- backsolve(R, diag(qr.X$rank))
rowSums(Rinv ^ 2) * se2
# [1] 0.22934170 0.02405009
```

Let's check the correctness by comparing with `lm`

:

```
fit <- lm(Sepal.Length ~ Sepal.Width, iris)
vcov(fit)
# (Intercept) Sepal.Width
#(Intercept) 0.22934170 -0.07352916
#Sepal.Width -0.07352916 0.02405009
```

Yes, all correct!

Finally, some comments on your posted code:

Instead of

`b <- (t(qr.Q(qr.X)) %*% y)[1:p]`

you can just use

`b <- qr.qty(qr.X, y)[1:qr.X$rank]`

You don't have to extract

`R <- qr.R(qr.X)`

for`backsolve`

; using`qr.X$qr`

is sufficient:`beta <- as.vector(backsolve(qr.X$qr,b)) Rinv <- backsolve(qr.X$qr, diag(qr.X$rank))`

In summary, here is a complete session:

```
## data
y <- iris$Sepal.Length
x <- iris$Sepal.Width
## add intercept to model matrix
X <- cbind(1,x)
## QR factorization of model matrix
qr.X <- qr(X)
## get estimated coefficient
b <- qr.qty(qr.X, y)
beta <- as.vector(backsolve(qr.X$qr, b))
## compute residuals (I don't use `qr.resid` and `qr.fitted`, though I can)
res <- as.vector(y - X %*% beta)
## residual standard error
se2 <- sum(res ^ 2) / (nrow(X) - qr.X$rank)
## full variance-covariance matrix
chol2inv(qr.X$qr) * se2
```