RickyB - 1 month ago
R Question

replicate() versus a for loop?

Does anyone know how the replicate() function works in R and how efficient it is relative to using a for loop?

For example, is there any efficiency difference between...

means <- replicate(100000, mean(rnorm(50)))


And...

means <- c()
for(i in 1:100000) {
   means <- c(means, mean(rnorm(50)))
}


(I may have typed something slightly off above, but you get the idea.)

Answer

You can benchmark the code and answer the question empirically. Note that I also added a second for-loop variant, which avoids the growing-vector problem by preallocating the result vector.

library(rbenchmark)  # benchmark() comes from the rbenchmark package

# replicate(): re-evaluates the expression no_rep times
repl_function = function(no_rep) replicate(no_rep, mean(rnorm(50)))
# for loop that grows the result vector on every iteration
for_loop = function(no_rep) {
   means <- c()
   for(i in 1:no_rep) {
      means <- c(means, mean(rnorm(50)))
   }
   means
}
# for loop that fills a preallocated result vector
for_loop_prealloc = function(no_rep) {
   means <- vector(mode = "numeric", length = no_rep)
   for(i in 1:no_rep) {
      means[i] <- mean(rnorm(50))
   }
   means
}

no_loops = 50e3
benchmark(repl_function(no_loops),
          for_loop(no_loops),
          for_loop_prealloc(no_loops),
          replications = 3)

                         test replications elapsed relative user.self sys.self user.child sys.child
2          for_loop(no_loops)            3  18.886    6.274    17.803    0.894          0         0
3 for_loop_prealloc(no_loops)            3   3.209    1.066     3.189    0.000          0         0
1     repl_function(no_loops)            3   3.010    1.000     2.997    0.000          0         0

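Looking at the relative column, the for loop that grows its vector is about 6.3 times slower than replicate(), because c(means, ...) allocates a new vector and copies the old one on every iteration. The preallocated for loop, however, is essentially as fast as replicate() (a relative time of 1.066).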
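
As for how replicate() works: in base R it is a thin wrapper around sapply() that re-evaluates the expression once per element of a dummy vector, which would explain why it performs about the same as the preallocated loop. The snippet below is a minimal sketch of that equivalence, not the actual source of replicate(). The names via_replicate, via_sapply and via_loop, the seed 42 and n = 5 are arbitrary choices for illustration.

```r
# Sketch: three ways to draw the same n sample means.
n <- 5

set.seed(42)
via_replicate <- replicate(n, mean(rnorm(50)))

# Roughly what replicate() does internally: sapply() over a dummy vector,
# evaluating the expression once per element and ignoring the element itself.
set.seed(42)
via_sapply <- sapply(integer(n), function(...) mean(rnorm(50)))

# Preallocated loop: the result vector is created once and filled in place.
set.seed(42)
via_loop <- numeric(n)
for (i in seq_len(n)) {
  via_loop[i] <- mean(rnorm(50))
}

# With the same seed, all three consume the random stream in the same order.
stopifnot(identical(via_replicate, via_sapply),
          identical(via_replicate, via_loop))
```

All three do the same amount of work per iteration, so the choice between replicate() and a preallocated loop is mostly a matter of readability. Only the version that grows the vector with c() pays the extra copying cost.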