Jesse001 - 12 days ago 4x

R Question

I'm at a total loss on this one. I have a large, though not unreasonable, data frame in R (48000 rows x 19 columns). I'm trying to use `sm.ancova()` to investigate differences in effect slopes, but got

`error: cannot allocate vector of size 13.1GB`

13GB overtaxed the memory allocated to R, I get that. But... what?! The entire CSV file I read in was only 24,000kb. Why are these single vectors so huge in R?

The ancova code I'm using is:

`data1<-read.csv("data.csv")`

`attach(data1)`

`sm.ancova(s,dt,dip,model="none")`

Looking into it a bit, I used:

`diag(s)`

`length(s)`

`diag(dt)`

`length(dt)`

`diag(dip)`

`length(dip)`

Which all gave the same error. Their lengths are all 48000.

Any explanation would help. A fix would be better :)

Thanks in advance!

A dummy data link that reproduces this problem can be found at: https://www.dropbox.com/s/dxxofb3o620yaw3/stackexample.csv?dl=0

Answer

Get data:

```
## CSV file is 10M on disk, so it's worth using a faster method
## than read.csv() to import ...
data1 <- data.table::fread("stackexample.csv",data.table=FALSE)
dd <- data1[,c("s","dt","dip")]
```

If you give `diag()` a vector, it's going to try to make a diagonal matrix with that vector on the diagonal. The example data set you gave us is 96,000 rows long, so `diag()` applied to any of its columns will try to construct a 96,000 x 96,000 matrix. A 1000x1000 matrix is

```
format(object.size(diag(1000)),"Mb") ## 7.6 Mb
```

so the matrix you're trying to construct here will be 96^2*7.6/1024 = 68 Gb.

A 24Kx24K matrix would be 16 times smaller but still about 4 Gb ...
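The arithmetic above can be checked without allocating anything large: a dense numeric matrix stores 8 bytes per entry, so an n x n matrix needs roughly 8 * n^2 bytes. Here is a minimal sketch; the helper name `dense_gib` is made up for illustration, and it ignores the small object header:

```r
## diag() applied to a vector of length n returns a dense n x n matrix
m <- diag(c(2, 5, 7))
dim(m)   ## 3 3

## hypothetical helper: approximate size in GiB of a dense n x n double matrix
## (8 bytes per entry; the object header of a few hundred bytes is ignored)
dense_gib <- function(n) 8 * n^2 / 1024^3

dense_gib(96000)  ## roughly 68.7
dense_gib(24000)  ## roughly 4.3
```

The small difference from the 68 Gb figure above comes from rounding the 1000x1000 size to 7.6 Mb.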

It *is* possible to use *sparse* matrices to construct big diagonal matrices:

```
library(Matrix)
object.size(Diagonal(x=1:96000))
## 769168 bytes
```
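As a rough illustration of why the sparse version stays small, the sketch below multiplies a sparse diagonal matrix by a vector without ever forming the dense 96,000 x 96,000 matrix. The variable names are arbitrary, and whether this helps in practice depends on the downstream code accepting `Matrix` objects:

```r
library(Matrix)

n <- 96000
x <- seq_len(n)
D <- Diagonal(x = x)   ## sparse diagonal matrix: only the n diagonal values are stored

dim(D)                 ## 96000 96000

v <- rep(1, n)
y <- D %*% v           ## n x 1 result, computed without a dense n x n matrix
all.equal(as.vector(y), x * v)   ## TRUE
```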

More generally, not all analysis programs are written with computational efficiency (either speed or memory) in mind. The papers on which this method is based (`?sm.ancova`) were written in the late 1990s, when 24,000 observations would have constituted a huge data set ...

Source (Stackoverflow)
