Nudibranch Nudibranch - 4 months ago 22
R Question

How to calculate variance of least squares estimator using QR decomposition in R?

I'm trying to learn QR decomposition, but can't figure out how to get the variance of beta_hat without resorting to traditional matrix calculations. I'm practising with the

data set, and here's what I have so far:

b<-(t(qr.Q(qr.X)) %*% y)[1:p]
res<-as.vector(y-X %*% beta)

Thanks for your help!


enter image description here

Residual degree of freedom is n - p (More accurately, n - qr.X$rank. In the following, I will use qr.X$rank instead of p.), so estimated variance is

se2 <- sum(res ^ 2) / (n - qr.X$rank)

Thus, the variance covariance matrix of your estimated beta_hat is just

chol2inv(R) * se2

#           [,1]         [,2]
#[1,]  0.22934170 -0.07352916
#[2,] -0.07352916  0.02405009

Very often, we only want marginal variance, i.e., the diagonals of this full variance-covariance matrix. In that case, there is no need to call chol2inv to first explicitly form the full covariance matrix. We can compute diagonals directly at much lower computational costs:

Rinv <- backsolve(R, diag(qr.X$rank))
rowSums(Rinv ^ 2) * se2
# [1] 0.22934170 0.02405009

Let's check the correctness by comparing with lm:

fit <- lm(Sepal.Length ~ Sepal.Width, iris)


#            (Intercept) Sepal.Width
#(Intercept)  0.22934170 -0.07352916
#Sepal.Width -0.07352916  0.02405009

Yes, all correct!

Finally, some comments on your posted code:

  1. Instead of

    b <- (t(qr.Q(qr.X)) %*% y)[1:p]

    you can just use

    b <- qr.qty(qr.X, y)[1:qr.X$rank]
  2. You don't have to extract R <- qr.R(qr.X) for backsolve; using qr.X$qr is sufficient:

    beta <- as.vector(backsolve(qr.X$qr,b))
    Rinv <- backsolve(qr.X$qr, diag(qr.X$rank))

In summary, here is a complete session:

## data
y <- iris$Sepal.Length
x <- iris$Sepal.Width

## add intercept to model matrix
X <- cbind(1,x)

## QR factorization of model matrix
qr.X <- qr(X)

## get estimated coefficient
b <- qr.qty(qr.X, y)
beta <- as.vector(backsolve(qr.X$qr, b))

## compute residuals (I don't use `qr.resid` and `qr.fitted`, though I can)
res <- as.vector(y - X %*% beta)

## residual standard error
se2 <- sum(res ^ 2) / (nrow(X) - qr.X$rank)

## full variance-covariance matrix
chol2inv(qr.X$qr) * se2