Nudibranch - 2 months ago 15
R Question

# How to calculate variance of least squares estimator using QR decomposition in R?

I'm trying to learn QR decomposition, but can't figure out how to get the variance of beta_hat without resorting to traditional matrix calculations. I'm practising with the

`iris`
data set, and here's what I have so far:

``````y<-(iris\$Sepal.Length)
x<-(iris\$Sepal.Width)
X<-cbind(1,x)
n<-nrow(X)
p<-ncol(X)
qr.X<-qr(X)
b<-(t(qr.Q(qr.X)) %*% y)[1:p]
R<-qr.R(qr.X)
beta<-as.vector(backsolve(R,b))
res<-as.vector(y-X %*% beta)
``````

Thanks for your help!

Residual degree of freedom is `n - p` (More accurately, `n - qr.X\$rank`. In the following, I will use `qr.X\$rank` instead of `p`.), so estimated variance is

``````se2 <- sum(res ^ 2) / (n - qr.X\$rank)
``````

Thus, the variance covariance matrix of your estimated `beta_hat` is just

``````chol2inv(R) * se2

#           [,1]         [,2]
#[1,]  0.22934170 -0.07352916
#[2,] -0.07352916  0.02405009
``````

Very often, we only want marginal variance, i.e., the diagonals of this full variance-covariance matrix. In that case, there is no need to call `chol2inv` to first explicitly form the full covariance matrix. We can compute diagonals directly at much lower computational costs:

``````Rinv <- backsolve(R, diag(qr.X\$rank))
rowSums(Rinv ^ 2) * se2
# [1] 0.22934170 0.02405009
``````

Let's check the correctness by comparing with `lm`:

``````fit <- lm(Sepal.Length ~ Sepal.Width, iris)

vcov(fit)

#            (Intercept) Sepal.Width
#(Intercept)  0.22934170 -0.07352916
#Sepal.Width -0.07352916  0.02405009
``````

Yes, all correct!

Finally, some comments on your posted code:

``````b <- (t(qr.Q(qr.X)) %*% y)[1:p]
``````

you can just use

``````b <- qr.qty(qr.X, y)[1:qr.X\$rank]
``````
2. You don't have to extract `R <- qr.R(qr.X)` for `backsolve`; using `qr.X\$qr` is sufficient:

``````beta <- as.vector(backsolve(qr.X\$qr,b))
Rinv <- backsolve(qr.X\$qr, diag(qr.X\$rank))
``````

In summary, here is a complete session:

``````## data
y <- iris\$Sepal.Length
x <- iris\$Sepal.Width

## add intercept to model matrix
X <- cbind(1,x)

## QR factorization of model matrix
qr.X <- qr(X)

## get estimated coefficient
b <- qr.qty(qr.X, y)
beta <- as.vector(backsolve(qr.X\$qr, b))

## compute residuals (I don't use `qr.resid` and `qr.fitted`, though I can)
res <- as.vector(y - X %*% beta)

## residual standard error
se2 <- sum(res ^ 2) / (nrow(X) - qr.X\$rank)

## full variance-covariance matrix
chol2inv(qr.X\$qr) * se2
``````