tattybojangler - 6 months ago 54

R Question

I am trying to get the Lower Triangular Cholesky Decomposition of the following matrix in R using the

`chol()`

`x <- matrix(c(4,2,-2, 2,10,2, -2,2,5), ncol = 3, nrow = 3)`

Q <- chol(x)

Q

# [,1] [,2] [,3]

# [1,] 2 1 -1.000000

# [2,] 0 3 1.000000

# [3,] 0 0 1.732051

I basically need to find the matrix

`Q`

`QQ' = X`

Answer

We can use `t()`

to transpose the resulting upper triangular to get lower triangular:

```
x <- matrix(c(4,2,-2, 2,10,2, -2,2,5), ncol = 3, nrow = 3)
R <- chol(x) ## upper tri
L <- t(R) ## lower tri
all.equal(crossprod(R), x) ## t(R) %*% R
# [1] TRUE
all.equal(tcrossprod(L), x) ## L %*% t(L)
# [1] TRUE
```

But, I think you are not the only one who has such doubt. So I will elaborate more on this.

`chol.default`

from R base calls LAPACK routine `dpotrf`

for non-pivoted Cholesky factorization, and LAPACK routine `dpstrf`

for pivoted Cholesky factorization. Both LAPACK routines allow users to choose whether to work with upper triangular or lower triangular, but R disables the lower triangular option and only returns upper triangular. See the source code:

```
chol.default
#function (x, pivot = FALSE, LINPACK = FALSE, tol = -1, ...)
#{
# if (is.complex(x))
# stop("complex matrices not permitted at present")
# .Internal(La_chol(as.matrix(x), pivot, tol))
#}
#<bytecode: 0xb5694b8>
#<environment: namespace:base>
// from: R_<version>/src/modules/lapack.c
static SEXP La_chol(SEXP A, SEXP pivot, SEXP stol)
{
// ...omitted part...
if(!piv) {
int info;
F77_CALL(dpotrf)("Upper", &m, REAL(ans), &m, &info);
if (info != 0) {
if (info > 0)
error(_("the leading minor of order %d is not positive definite"),
info);
error(_("argument %d of Lapack routine %s had invalid value"),
-info, "dpotrf");
}
} else {
double tol = asReal(stol);
SEXP piv = PROTECT(allocVector(INTSXP, m));
int *ip = INTEGER(piv);
double *work = (double *) R_alloc(2 * (size_t)m, sizeof(double));
int rank, info;
F77_CALL(dpstrf)("U", &m, REAL(ans), &m, ip, &rank, &tol, work, &info);
if (info != 0) {
if (info > 0)
warning(_("the matrix is either rank-deficient or indefinite"));
else
error(_("argument %d of Lapack routine %s had invalid value"),
-info, "dpstrf");
}
// ...omitted part...
return ans;
}
```

As you can see, it passes "Upper" or "U" to LAPACK.

The reason that upper triangular factor is more commonly seen in statistics, is that we often compare Cholesky factorization with QR factorization in least squares computation, while the latter only returns upper triangular. Requiring `chol.default`

to always return upper triangular offers code reuse. For example, the `chol2inv`

function is used to find unscaled covariance of least square estimate, where we can feed it either the result of `chol.default`

or `qr.default`

. See How to calculate variance of least squares estimator using QR decomposition in R? for details.