Matteo De Felice - 1 month ago
R Question

Strange behaviour when computing svd on a covariance matrix: different results between Microsoft R and vanilla R

I was doing some principal component analysis on my MacBook running Microsoft R 3.3.0 when I got some strange results. Double-checking with a colleague, I realised that the output of the svd function differed from what I get with vanilla R.

To reproduce the result, please load the file (~78 MB) here

With Microsoft R 3.3.0 (x86_64-apple-darwin14.5.0) I get:

> sv <- svd(Cx)
> print(sv$d[1:10])

[1] 122.73664 104.45759 90.52001 87.21890 81.28256 74.33418 73.29427 66.26472 63.51379
[10] 55.20763


With vanilla R (both R 3.3 and R 3.3.1, on two different Linux machines) I get instead:

> sv <- svd(Cx)
> print(sv$d[1:10])

[1] 122.73664 34.67177 18.50610 14.04483 8.35690 6.80784 6.14566
[8] 3.91788 3.76016 2.66381


This does not happen with all data: if I create a random matrix and apply svd to it, I get the same results on both installations. So it looks like some sort of numerical instability, doesn't it?
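One way to judge which of two differing outputs is plausible is to test the singular values against quantities that do not go through LAPACK at all. The sketch below is not from the original post, and `check_svd` is a hypothetical helper name. It relies on two identities: the squared singular values sum to the squared Frobenius norm of the matrix, and for a symmetric positive semi-definite matrix (which a covariance matrix should be) the singular values sum to the trace. The reconstruction check uses `%*%`, so it still depends on the BLAS in use.

```r
# Hypothetical helper (not part of the original question): relative errors of
# an SVD result against simple invariants of the input matrix.
check_svd <- function(X, sv) {
  # U diag(d) V^T; sv$d * t(sv$v) scales row i of t(V) by d[i]
  recon <- sv$u %*% (sv$d * t(sv$v))
  c(
    # Uses matrix multiplication, so it still goes through BLAS
    reconstruction = sqrt(sum((X - recon)^2)) / sqrt(sum(X^2)),
    # Elementwise only: sum of d^2 equals the squared Frobenius norm
    frobenius = abs(sum(sv$d^2) - sum(X^2)) / sum(X^2),
    # Elementwise only, valid for symmetric PSD input: sum of d equals the trace
    trace = abs(sum(sv$d) - sum(diag(X))) / sum(diag(X))
  )
}

# Small stand-in for Cx: the covariance matrix of random data
set.seed(1)
Z <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
C <- cov(Z)
res <- check_svd(C, svd(C))
print(res)

# Deliberately corrupted singular values, to show the checks reacting
sv_bad <- svd(C)
sv_bad$d[2] <- 3 * sv_bad$d[2]
res_bad <- check_svd(C, sv_bad)
print(res_bad)
```

Running `check_svd(Cx, svd(Cx))` on both installations should show small values for the result that is consistent with the matrix and clearly larger ones for the other.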

UPDATE: I've tried computing the SVD of the same matrix (Cx) on the same machine (MacBook) with the same version of R, using the svd package, and I finally get the "right" numbers. So it seems to be due to the svd implementation used by Microsoft R Open.
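A cross-check that needs no extra package is to compare against `eigen()`: for a symmetric positive semi-definite matrix the eigenvalues, sorted in decreasing order, should coincide with the singular values. This is a minimal sketch on a small stand-in matrix, not the author's procedure. It assumes that `Cx` really is symmetric, and since `eigen()` is served by the same LAPACK library as `svd()`, agreement is only weak evidence while disagreement is a clear warning sign.

```r
# Small stand-in for Cx: the covariance matrix of random data
set.seed(42)
Z <- matrix(rnorm(300 * 6), nrow = 300, ncol = 6)
C <- cov(Z)

# Singular values from svd()
d_svd <- svd(C)$d

# Eigenvalues from the symmetric eigensolver, returned in decreasing order
d_eig <- eigen(C, symmetric = TRUE, only.values = TRUE)$values

# For a symmetric PSD matrix the two vectors should agree up to rounding
agree <- isTRUE(all.equal(d_svd, d_eig))
print(agree)
print(max(abs(d_svd - d_eig)))
```

On the real data, the same comparison would be `all.equal(svd(Cx)$d, eigen(Cx, symmetric = TRUE, only.values = TRUE)$values)`.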

UPDATE: The behaviour also occurs on MRO 3.3.1.

Answer

This appears to be a bug, as confirmed on the microsoft-r-open GitHub repository. They say the behaviour is under investigation and is related to the Accelerate library on macOS.
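To see which linear algebra libraries a given R session is using, the following sketch may help. `La_version()` reports the LAPACK version string. The `BLAS` and `LAPACK` fields of `sessionInfo()` are, as far as I know, only filled in on R versions newer than the 3.3.x releases discussed here, so they may be NULL on older installations. That a build linked against Accelerate would show it in those paths is my assumption, not something stated in the bug report.

```r
# LAPACK version string reported by the running R session
lapack_version <- La_version()
print(lapack_version)

# On newer R versions sessionInfo() also records the library paths;
# on older ones these fields may simply be NULL.
si <- sessionInfo()
print(si$BLAS)
print(si$LAPACK)
```

Comparing this output between the MacBook and the Linux machines would at least confirm that the two installations use different backends.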
