tattybojangler tattybojangler - 29 days ago 17
R Question

chol() function in R keeps returning Upper Triangular (I need Lower Triangular)

I am trying to get the Lower Triangular Cholesky Decomposition of the following matrix in R using the

chol()
function. However, it keeps returning the Upper Triangular Decomposition and I can't seem to find a way to get the Lower Triangular Decomposition, even after looking through the documentation. The following is the code that I am using -

x <- matrix(c(4,2,-2, 2,10,2, -2,2,5), ncol = 3, nrow = 3)
Q <- chol(x)
Q
# [,1] [,2] [,3]
# [1,] 2 1 -1.000000
# [2,] 0 3 1.000000
# [3,] 0 0 1.732051


I basically need to find the matrix
Q
such that
QQ' = X
. Thanks for your help!

Answer

We can use t() to transpose the resulting upper triangular to get lower triangular:

x <- matrix(c(4,2,-2, 2,10,2, -2,2,5), ncol = 3, nrow = 3)
R <- chol(x)  ## upper tri
L <- t(R)  ## lower tri
all.equal(crossprod(R), x)  ## t(R) %*% R
# [1] TRUE
all.equal(tcrossprod(L), x)  ## L %*% t(L)
# [1] TRUE

But, I think you are not the only one who has such doubt. So I will elaborate more on this.

chol.default from R base calls LAPACK routine dpotrf for non-pivoted Cholesky factorization, and LAPACK routine dpstrf for pivoted Cholesky factorization. Both LAPACK routines allow users to choose whether to work with upper triangular or lower triangular, but R disables the lower triangular option and only returns upper triangular. See the source code:

chol.default
#function (x, pivot = FALSE, LINPACK = FALSE, tol = -1, ...) 
#{
#    if (is.complex(x)) 
#        stop("complex matrices not permitted at present")
#    .Internal(La_chol(as.matrix(x), pivot, tol))
#}
#<bytecode: 0xb5694b8>
#<environment: namespace:base>

// from: R_<version>/src/modules/lapack.c
static SEXP La_chol(SEXP A, SEXP pivot, SEXP stol)
{
  // ...omitted part...
if(!piv) {
int info;
F77_CALL(dpotrf)("Upper", &m, REAL(ans), &m, &info);
if (info != 0) {
    if (info > 0)
    error(_("the leading minor of order %d is not positive definite"),
          info);
    error(_("argument %d of Lapack routine %s had invalid value"),
      -info, "dpotrf");
}
} else {
double tol = asReal(stol);
SEXP piv = PROTECT(allocVector(INTSXP, m));
int *ip = INTEGER(piv);
double *work = (double *) R_alloc(2 * (size_t)m, sizeof(double));
int rank, info;
F77_CALL(dpstrf)("U", &m, REAL(ans), &m, ip, &rank, &tol, work, &info);
if (info != 0) {
    if (info > 0)
    warning(_("the matrix is either rank-deficient or indefinite"));
    else
    error(_("argument %d of Lapack routine %s had invalid value"),
          -info, "dpstrf");
}
// ...omitted part...
return ans;
}

As you can see, it passes "Upper" or "U" to LAPACK.

The reason that upper triangular factor is more commonly seen in statistics, is that we often compare Cholesky factorization with QR factorization in least squares computation, while the latter only returns upper triangular. Requiring chol.default to always return upper triangular offers code reuse. For example, the chol2inv function is used to find unscaled covariance of least square estimate, where we can feed it either the result of chol.default or qr.default. See How to calculate variance of least squares estimator using QR decomposition in R? for details.