
Reproducing Singular Value Decomposition in R

I have an example word-by-document matrix (from Landauer and Dumais, 1997):

wxd <- matrix(c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0,
                0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0,
                1, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0,
                0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
                0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1),
              nrow = 12, ncol = 9)
rownames(wxd) <- c("human", "interface", "computer", "user", "system",
"response", "time", "EPS", "survey", "trees", "graph", "minors")
colnames(wxd) <- c(paste0("c", 1:5), paste0("m", 1:4))


I can perform Singular Value Decomposition on this matrix using the svd() function and obtain three matrices U, S, and V:

SVD <- svd(wxd)
U <- SVD$u         # left singular vectors (12 x 9)
S <- diag(SVD$d)   # singular values on the diagonal (9 x 9)
V <- SVD$v         # right singular vectors (9 x 9)
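
One quick sanity check: the columns of U and V should each be orthonormal, so both of these should print a (numerically, after rounding) 9 x 9 identity matrix:

round(crossprod(U), 10)   # t(U) %*% U
round(crossprod(V), 10)   # t(V) %*% V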


I can multiply these matrices and get my original matrix returned (within some small margin of error):

U %*% S %*% t(V)
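
For example, the largest element-wise deviation from the original matrix is at machine precision:

max(abs(U %*% S %*% t(V) - wxd))   # ~1e-15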


I can also take the first two columns of the U and V matrices and the first two rows and columns of the S matrix to get the least-squares best rank-2 approximation of the original data. This fits with the results of the same procedure in the paper I mentioned above:

U[ , 1:2] %*% S[1:2, 1:2] %*% t(V[ , 1:2])
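
By the Eckart-Young theorem, the Frobenius-norm error of this rank-2 approximation equals the root sum of squares of the discarded singular values, which makes for a quick check (approx2 is just a throwaway name here):

approx2 <- U[, 1:2] %*% S[1:2, 1:2] %*% t(V[, 1:2])
sqrt(sum((wxd - approx2)^2))   # Frobenius norm of the residual
sqrt(sum(SVD$d[3:9]^2))        # same value from the discarded singular values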


I want to make sure I understand what this function is doing (as best as I am able), and I have been able to generate the V and S matrices to match those from the svd() function:

ATA <- t(wxd) %*% wxd       # 9 x 9 matrix A'A
V2 <- eigen(ATA)$vectors    # its eigenvectors give the right singular vectors

S2 <- sqrt(diag(eigen(ATA)$values))   # singular values are the square roots of its eigenvalues
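
Assuming the singular values are distinct (so eigen() orders the eigenvalues the same way svd() orders the singular values), these should agree with V and S up to the signs of the columns:

V / V2                          # each column should be all 1s or all -1s
round(diag(S) - diag(S2), 10)   # singular values should agree: all 0s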


But the U matrix I generate has the same absolute values as svd()'s U for its first 9 columns, plus an additional 3 columns. And some elements of this U matrix have different signs than the U matrix from the svd() function:

AAT <- wxd %*% t(wxd)       # 12 x 12 matrix AA'
U2 <- eigen(AAT)$vectors    # its eigenvectors should give the left singular vectors
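
You can verify that agreement in absolute value directly (again assuming distinct singular values, so the column ordering matches):

max(abs(abs(U) - abs(U2[, 1:9])))   # ~0: first 9 columns match up to sign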


So my question is: why is the U matrix different from svd()'s when I attempt to calculate it from scratch?

Answer

wxd has rank 9. Therefore, your AAT only has 9 non-zero eigenvalues (the rest are numerically zero, on the order of 1e-16). For those zero eigenvalues, the corresponding eigenvectors are arbitrary, as long as they span the subspace of R^12 orthogonal to the one spanned by the other nine eigenvectors.
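
You can confirm this directly with the AAT from the question:

qr(wxd)$rank                   # 9
round(eigen(AAT)$values, 10)   # the last 3 eigenvalues are numerically zero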

Now, by default svd only computes nu = min(n, p) left singular vectors (and similarly for right singular vectors), where n is the number of rows and p the number of columns of the input (see ?svd). Therefore, you only get 9 left singular vectors here. To generate all 12, call svd with:

svd(wxd, nu = nrow(wxd))
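
For example (Ufull is just a local name here), the extra columns still complete an orthonormal basis:

Ufull <- svd(wxd, nu = nrow(wxd))$u
dim(Ufull)                    # 12 12
round(crossprod(Ufull), 10)   # the 12 x 12 identity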

However, those extra 3 left singular vectors will not correspond to those found with eigen(AAT)$vectors, again because those eigenvectors are chosen somewhat arbitrarily to span the orthogonal subspace.

As for why some of the signs have changed, recall that eigenvectors are only determined up to a scale factor. Although these eigenvectors are normalized, they may still differ by a factor of -1. To check, just divide each column of U by the corresponding column of U2. You should get columns of all 1s or all -1s:

U[,1:9]/U2[,1:9]
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,]    1   -1    1   -1    1   -1    1    1    1
## [2,]    1   -1    1   -1    1   -1    1    1    1
## [3,]    1   -1    1   -1    1   -1    1    1    1
## [4,]    1   -1    1   -1    1   -1    1    1    1
## [5,]    1   -1    1   -1    1   -1    1    1    1
## [6,]    1   -1    1   -1    1   -1    1    1    1
## [7,]    1   -1    1   -1    1   -1    1    1    1
## [8,]    1   -1    1   -1    1   -1    1    1    1
## [9,]    1   -1    1   -1    1   -1    1    1    1
##[10,]    1   -1    1   -1    1   -1    1    1    1
##[11,]    1   -1    1   -1    1   -1    1    1    1
##[12,]    1   -1    1   -1    1   -1    1    1    1

Update: why an eigenvector is only determined up to a scale factor

This can be seen from the definition of an eigenvector. From Wikipedia:

In linear algebra, an eigenvector or characteristic vector of a linear transformation is a non-zero vector that does not change its direction when that linear transformation is applied to it.

In a finite-dimensional vector space, the linear transformation amounts to multiplying the vector by a square matrix A, so the definition becomes (this is where I wish SO supported LaTeX markdown, as this is not code; * is matrix multiplication here):

A * v = lambda * v

which is known as the eigenvalue equation for the matrix A, where lambda is the eigenvalue associated with the eigenvector v. From this equation, it is clear that if v is an eigenvector of A, then k * v for any non-zero scalar k is also an eigenvector of A, with the same associated eigenvalue lambda.
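
A quick numerical illustration, reusing AAT from the question (k = -3 is an arbitrary non-zero scalar):

ev      <- eigen(AAT)
v1      <- ev$vectors[, 1]   # an eigenvector of AAT
lambda1 <- ev$values[1]      # its associated eigenvalue
k       <- -3                # any non-zero scalar
max(abs(AAT %*% (k * v1) - lambda1 * (k * v1)))   # ~0: k * v1 is also an eigenvector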