Xizam - 1 year ago 73

R Question

I'm trying to rewrite some old code to make it more efficient. I read in many places that using apply should be faster than using a for loop, so I attempted this. First, the old working code:

```
dl=data.frame(replicate(16,1:15685849))
#in line below mean was sums, but this gave integer overflows. This is not the case in the real dataset, but for the purpose of this example mean will do.
sums<-mapply(mean, dl[,4:ncol(dl)], USE.NAMES=FALSE)
appel<-dl[,1:3]
for (i in 1:(ncol(dl)-3)){
  appel[,i+3]=dl[,i+3]/sums[i]
}
```

No problems so far. I was trying to rewrite this code as a function so I can make an R package for private use. This was my attempt:

```
dl=data.frame(replicate(16,1:15685849))
depthnormalise=function(tonormtable, skipleftcol=3){
  sums<-mapply(mean, dl[,4:ncol(dl)], USE.NAMES=FALSE)
  dn=function(x){x/sums}
  tonormtable[,(skipleftcol+1):ncol(tonormtable)]=t(apply(tonormtable[,(skipleftcol+1):ncol(tonormtable)], 1, dn))
}
appel=depthnormalise(dl)
```

However, this makes me run out of memory.

I have very little experience using apply, but I can't seem to get it figured out properly for a table where I want to leave the first 3 columns as is and only change the ones after that. If any more information is required please let me know before downvoting! If you only downvote, I won't get better.

Answer

Here is a working `apply` solution:

```
appel1 <- as.matrix(dl)
appel1[, -(1:3)] <- apply(appel1[, -(1:3)], 2,
                          function(x) round(x / mean(x) * 1e6, digits=2))
all.equal(as.matrix(appel), appel1)
#[1] TRUE
```
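To see what `MARGIN = 2` does in the call above, here is a minimal sketch on a toy matrix. The name `m` and its values are made up for illustration, and the rounding and `1e6` scaling are left out to keep it short. With `MARGIN = 2` the function is called once per column, whereas the attempt in the question uses `MARGIN = 1`, which calls it once per row and then needs a `t()` on the result.

```r
# Toy matrix: 4 rows, 5 columns (values are illustrative only)
m <- matrix(as.numeric(1:20), nrow = 4)

# Leave the first 3 columns untouched; divide the rest by their column means
scaled <- m
scaled[, -(1:3)] <- apply(m[, -(1:3)], 2, function(x) x / mean(x))

# Each rescaled column should now have a mean of (approximately) 1
colMeans(scaled[, -(1:3)])
```

On a table with millions of rows and only a handful of columns, that difference in the number of function calls is likely a large part of why the column-wise version is workable at all.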

However, as said in the comments, it won't be faster than a well-written `for` loop. It's slower on my system.
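For reference, here is a rough sketch of what such a `for` loop could look like when wrapped in a function, as the question set out to do. The name `depth_normalise_loop` and the 1000-row stand-in table are assumptions for illustration, not part of the original code, and the sketch assumes `skipleftcol` is smaller than the number of columns. It works one column at a time on the data frame, so it never builds a row-wise intermediate.

```r
# Hypothetical helper (not from the original post): divide every column after
# the first `skipleftcol` columns by that column's own mean, using a plain loop.
depth_normalise_loop <- function(tonormtable, skipleftcol = 3) {
  # Indices of the columns to normalise, e.g. 4:16 for 16 columns and skipleftcol = 3
  cols <- seq_len(ncol(tonormtable) - skipleftcol) + skipleftcol
  for (j in cols) {
    tonormtable[[j]] <- tonormtable[[j]] / mean(tonormtable[[j]])
  }
  tonormtable  # return the whole table explicitly
}

# Small stand-in for the 15,685,849-row table from the question
small <- data.frame(replicate(16, 1:1000))
res_loop <- depth_normalise_loop(small)

# Same computation with apply over columns, for comparison
res_apply <- as.matrix(small)
res_apply[, -(1:3)] <- apply(res_apply[, -(1:3)], 2, function(x) x / mean(x))

all.equal(as.matrix(res_loop), res_apply)
```

Wrapping either call in `system.time()` on your own data is the simplest way to check which one is faster on your machine.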