
# Reproducing Singular Value Decomposition in R

I have an example word by document matrix (from Landauer and Dumais, 1997):

```
## 12 terms x 9 documents; the matrix fills column-wise,
## so each line of 12 values below is one document (column)
wxd <- matrix(c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0,
                0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0,
                1, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0,
                0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
                0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1),
              nrow = 12, ncol = 9)
rownames(wxd) <- c("human", "interface", "computer", "user", "system",
                   "response", "time", "EPS", "survey", "trees", "graph", "minors")
colnames(wxd) <- c(paste0("c", 1:5), paste0("m", 1:4))
```

I can perform Singular Value Decomposition on this matrix using the `svd()` function, which gives me three matrices, `U`, `S`, and `V`:

```
SVD <- svd(wxd)
U <- SVD$u
S <- diag(SVD$d)
V <- SVD$v
```
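
For reference, a quick look at the dimensions (a small check of my own, using only base R's `dim()`; `svd()` keeps `min(12, 9) = 9` singular values for this 12 x 9 matrix):

```
dim(U)   # 12  9  -- one left singular vector per singular value kept
dim(S)   #  9  9  -- diagonal matrix holding the 9 singular values
dim(V)   #  9  9  -- one right singular vector per column of wxd
```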

I can multiply these matrices and get my original matrix returned (within some small margin of error):

```
U %*% S %*% t(V)
```
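
A quick numerical check of that claim (my own sketch, base R only):

```
## The reconstruction should match wxd up to floating-point noise
max(abs(U %*% S %*% t(V) - wxd))   # on the order of 1e-15

## all.equal() compares within a numeric tolerance;
## check.attributes = FALSE because the product carries no dimnames
all.equal(U %*% S %*% t(V), wxd, check.attributes = FALSE)
```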

I can also take the first two columns of the `U` and `V` matrices and the first two rows and columns of the `S` matrix to get the least squares best approximation of the original data. This fits with the results of the same procedure in the paper I mentioned above:

```
U[, 1:2] %*% S[1:2, 1:2] %*% t(V[, 1:2])
```
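
To see what "least squares best" means here: by the Eckart-Young theorem, the Frobenius norm of the residual of the truncated SVD equals the root sum of squares of the discarded singular values. A short sketch of this (my addition, not from the original post):

```
wxd2 <- U[, 1:2] %*% S[1:2, 1:2] %*% t(V[, 1:2])

## Frobenius norm of the rank-2 approximation error ...
norm(wxd - wxd2, type = "F")

## ... equals the root sum of squares of the discarded singular values
sqrt(sum(SVD$d[3:9]^2))
```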

I want to make sure I understand what this function is doing (as best I am able), and I have been able to generate the `V` and `S` matrices to match those from the `svd()` function:

```
ATA <- t(wxd) %*% wxd
V2 <- eigen(ATA)$vectors

S2 <- sqrt(diag(eigen(ATA)$values))
```
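
These can be compared against the `svd()` output directly: the singular values should agree exactly, and `V2` should agree with `V` column by column up to sign, assuming the singular values are distinct (a sketch of my own):

```
## Singular values are the square roots of the eigenvalues of t(wxd) %*% wxd
all.equal(SVD$d, sqrt(eigen(ATA)$values))

## V2 matches V up to a per-column sign flip, so the absolute values agree
max(abs(abs(V) - abs(V2)))   # floating-point noise, ~1e-15
```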

But the `U` matrix I generate has the same absolute values as the `svd()` version for its first 9 columns, plus an additional 3 columns. And some elements of this `U` matrix have different signs than the corresponding elements of the `U` matrix from the `svd()` function:

```
AAT <- wxd %*% t(wxd)
U2 <- eigen(AAT)$vectors
```

So my question is: why is the `U` matrix different when I attempt to calculate it from scratch?

## Answer

`wxd` has rank `9`. Therefore, your `AAT` only has `9` non-zero eigenvalues (the rest are numerically zero, on the order of `1e-16`). For those zero eigenvalues, the eigenvectors are arbitrary, as long as they span the subspace of `R^12` orthogonal to the one spanned by the other eigenvectors.

Now, by default `svd` only computes `nu = min(n, p)` left singular vectors (and similarly for right singular vectors), where `n` is the number of rows and `p` is the number of columns of the input (see `?svd`). Therefore, you only get `9` left singular vectors here. To generate all `12`, call `svd` with:

```
svd(wxd, nu = nrow(wxd))
```

However, those extra `3` left singular vectors will not correspond to those found with `eigen(AAT)$vectors`, again because those eigenvectors are determined somewhat arbitrarily to span that orthogonal subspace.
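
What does hold is that those `3` extra columns form an orthonormal basis for the leftover subspace: together with the first `9` they are an orthonormal set, and they lie in the null space of `t(wxd)`. A sketch to verify this (my addition):

```
Ufull <- svd(wxd, nu = nrow(wxd))$u   # 12 x 12

## All 12 columns are orthonormal: t(Ufull) %*% Ufull is the identity
zapsmall(crossprod(Ufull))

## The 3 extra columns are annihilated by t(wxd),
## i.e. they span the null space of t(wxd)
zapsmall(t(wxd) %*% Ufull[, 10:12])
```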

As for why some of the signs have changed, recall that eigenvectors are only determined up to a scale factor. Although these eigenvectors are normalized, they may still differ by a factor of `-1`. To check, just divide each column of `U` by the corresponding column of `U2`; you should get columns of all `1`s or all `-1`s:

```
U[, 1:9] / U2[, 1:9]
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
##  [1,]    1   -1    1   -1    1   -1    1    1    1
##  [2,]    1   -1    1   -1    1   -1    1    1    1
##  [3,]    1   -1    1   -1    1   -1    1    1    1
##  [4,]    1   -1    1   -1    1   -1    1    1    1
##  [5,]    1   -1    1   -1    1   -1    1    1    1
##  [6,]    1   -1    1   -1    1   -1    1    1    1
##  [7,]    1   -1    1   -1    1   -1    1    1    1
##  [8,]    1   -1    1   -1    1   -1    1    1    1
##  [9,]    1   -1    1   -1    1   -1    1    1    1
## [10,]    1   -1    1   -1    1   -1    1    1    1
## [11,]    1   -1    1   -1    1   -1    1    1    1
## [12,]    1   -1    1   -1    1   -1    1    1    1
```
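
If you need the signs to agree, you can flip the columns of `U2` to match `U` (a sketch, assuming distinct singular values so the columns correspond one-to-one; the dot product of matching unit-norm columns is `+1` or `-1`):

```
## Recover the per-column sign from the dot products, then apply it
signs <- sign(colSums(U[, 1:9] * U2[, 1:9]))
U2_flipped <- sweep(U2[, 1:9], 2, signs, "*")

all.equal(U, U2_flipped)   # TRUE within numerical tolerance
```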

### Update: why an eigenvector is only determined up to a scale factor

This can be seen from the definition of an eigenvector. From Wikipedia:

> In linear algebra, an eigenvector or characteristic vector of a linear transformation is a non-zero vector that does not change its direction when that linear transformation is applied to it.

In a finite-dimensional vector space, the linear transformation amounts to multiplying the vector by a square matrix `A`, and so the definition is (this is where I wish SO supported LaTeX markdown, as this is not code; `*` is matrix multiplication here):

```
A * v = lambda * v
```

which is known as the eigenvalue equation for the matrix `A`, where `lambda` is the eigenvalue associated with the eigenvector `v`. From this equation, it is clear that if `v` is an eigenvector of `A`, then `k * v` for any non-zero scalar `k` is also an eigenvector of `A`, with the same associated eigenvalue `lambda`.
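
A quick numerical illustration of that last point (a made-up 2 x 2 example of my own, not from the post):

```
A <- matrix(c(2, 1, 1, 2), 2, 2)   # small symmetric example matrix
e <- eigen(A)
v      <- e$vectors[, 1]           # eigenvector for the largest eigenvalue
lambda <- e$values[1]

k <- -3                            # any non-zero scalar works
## k * v still satisfies the eigenvalue equation for the same lambda
all.equal(drop(A %*% (k * v)), lambda * (k * v))   # TRUE
```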

Source: Stack Overflow