Xizam Xizam - 7 months ago 46
R Question

Running out of memory when trying to perform task with apply rather than for loop in R

I'm trying to rewrite some old code to make it more efficient. I read in many places that using apply should be faster than using a for loop, so I attempted to do this. First, the old working code:

#in line below mean was sums, but this gave integer overflows. This is not the case in the real dataset, but for the purpose of this example mean will do.
sums<-mapply(mean, dl[,4:ncol(dl)], USE.NAMES=FALSE)
for (i in 1:(ncol(dl)-3)){

No problems so far. I was trying to rewrite this code as a function so I can make an R package for private use. This was my attempt:

depthnormalise=function(tonormtable, skipleftcol=3){
sums<-mapply(mean, dl[,4:ncol(dl)], USE.NAMES=FALSE)
tonormtable[,(skipleftcol+1):ncol(tonormtable)]=t(apply(tonormtable[,(skipleftcol+1):ncol(tonormtable)], 1, dn))

but this makes me run out of memory.

I have very little experience using apply, and I can't figure out how to use it properly on a table where I want to leave the first 3 columns as they are and only change the ones after that. If any more information is required, please let me know before downvoting! If you only downvote, I won't get better.


Here is a working apply solution:

appel1 <- as.matrix(dl)
appel1[, -(1:3)] <- apply(appel1[, -(1:3)], 2, 
                          function(x) round(x / mean(x) * 1e6, digits=2))
all.equal(as.matrix(appel), appel1)
#[1] TRUE

However, as said in the comments, it won't be faster than a well-written for loop. It's slower on my system.
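For comparison, here is a minimal sketch of the same normalisation written as a plain for loop inside a function. It assumes the goal is the one in the apply call above (divide each column by its own mean, multiply by 1e6, round to two digits). The data frame `dl` below is a made-up stand-in, since the real data is not shown, and the function body is my guess at what was intended, not the asker's original code.

```r
# Hypothetical stand-in for the asker's table `dl`: three leading columns
# to keep as they are, followed by numeric count columns.
set.seed(1)
dl <- data.frame(id = 1:5, chr = "chr1", pos = 101:105,
                 s1 = rpois(5, 20), s2 = rpois(5, 50))

# Scale every column after the first `skipleftcol` columns by its own mean,
# multiply by 1e6 and round to two digits. The leading columns are untouched.
depthnormalise <- function(tonormtable, skipleftcol = 3) {
  # setdiff() also behaves correctly when skipleftcol is 0
  cols <- setdiff(seq_len(ncol(tonormtable)), seq_len(skipleftcol))
  for (j in cols) {
    x <- tonormtable[[j]]
    tonormtable[[j]] <- round(x / mean(x) * 1e6, digits = 2)
  }
  tonormtable
}

appel2 <- depthnormalise(dl)
```

Working column by column keeps the input as a data frame, so non-numeric leading columns are not coerced the way `as.matrix` coerces a mixed-type data frame to a character matrix. It also processes one column at a time instead of building and transposing a full copy of the numeric block.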