Xizam - 3 months ago
R Question

Running out of memory when trying to perform task with apply rather than for loop in R

I'm trying to rewrite some old code to make it more efficient. I read in many places that using apply should be faster than using a for loop, so I attempted that. First, the old working code:

dl=data.frame(replicate(16,1:15685849))
# The real code uses sum in the line below. On this example data sum gives integer overflows (the real dataset does not), so mean is used here instead.
sums<-mapply(mean, dl[,4:ncol(dl)], USE.NAMES=FALSE)
appel<-dl[,1:3]
for (i in 1:(ncol(dl)-3)){
appel[,i+3]=dl[,i+3]/sums[i]
}


No problems so far. I was trying to rewrite this code as a function so I can make an R package for private use. This was my attempt:

dl=data.frame(replicate(16,1:15685849))
depthnormalise=function(tonormtable, skipleftcol=3){
sums<-mapply(mean, dl[,4:ncol(dl)], USE.NAMES=FALSE)
dn=function(x){x/sums}
tonormtable[,(skipleftcol+1):ncol(tonormtable)]=t(apply(tonormtable[,(skipleftcol+1):ncol(tonormtable)], 1, dn))
}
appel=depthnormalise(dl)


However, this runs out of memory.

I have very little experience using apply, but I can't seem to get it figured out properly for a table where I want to leave the first 3 columns as is and only change the ones after that. If any more information is required please let me know before downvoting! If you only downvote, I won't get better.

Answer

Here is a working apply solution:

appel1 <- as.matrix(dl)
appel1[, -(1:3)] <- apply(appel1[, -(1:3)], 2, 
                          function(x) round(x / mean(x) * 1e6, digits=2))
all.equal(as.matrix(appel), appel1)
#[1] TRUE

However, as said in the comments, it won't be faster than a well-written for loop. It's slower on my system.
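
For reference, here is a minimal sketch of what a "well-written for loop" could look like when wrapped in a function, which is what the question was aiming for. The argument names tonormtable and skipleftcol come from the question; the body, dl_small and appel_small are my own assumptions about the intended behaviour, not tested on the full 15685849-row table. Two things differ from the attempt in the question: the means are computed from the function argument rather than from the global dl, and the modified table is returned explicitly. It also works column by column, whereas apply with margin 1 calls the function once per row and then needs t() on the result, which is the likely reason for the memory problem.

```r
# Hypothetical rewrite of the question's depthnormalise().
# Argument names follow the question; the body is an assumption.
depthnormalise <- function(tonormtable, skipleftcol = 3) {
  # Columns to normalise: everything right of the first `skipleftcol` columns.
  cols <- seq_len(ncol(tonormtable))
  cols <- cols[cols > skipleftcol]
  for (j in cols) {
    # Divide each column by its own mean, one column at a time.
    tonormtable[[j]] <- tonormtable[[j]] / mean(tonormtable[[j]])
  }
  tonormtable  # return the modified copy explicitly
}

# Small stand-in for the large table from the question.
dl_small <- data.frame(replicate(16, 1:1000))
appel_small <- depthnormalise(dl_small)
```

The first three columns of appel_small are untouched, and every other column has mean 1 after the call. If the real data needs sum instead of mean, only the one call inside the loop changes.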
