Plinth - 2 months ago
Question

How to get total CPU time with foreach?

I am trying to get total CPU hours of a code run in parallel (using foreach from the package doParallel) but I'm not sure how to go about doing this. I have used proc.time() but it just returns a difference in 'real' time. From what I have read of system.time(), it should also just do the same as proc.time(). How do I get total CPU hours of an R code run in parallel?

Answer

A little trick is to return the measured runtime together with your computation result in a list. In the example below, we use system.time() to get the runtime; it reports the same user, system, and elapsed times that proc.time() does.

NOTE: this is a modified example from my blog post, R with Parallel Computing from User Perspectives.

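# Illustrative code showing how to get the runtime of each task in foreach
library(foreach)
library(doParallel)

# Real physical cores in my computer
cores <- detectCores(logical = FALSE)
cl <- makeCluster(cores)
registerDoParallel(cl)

# The outer system.time() measures the master's wall-clock time for the whole loop
system.time(
  # Without .combine, foreach returns a flat list with one element per task
  # (.combine='list' would nest the results once there are more than two tasks)
  res.gather <- foreach(i = 1:cores) %dopar% {
    # The inner system.time() measures the time spent inside this worker
    s.time <- system.time({
      set.seed(i)
      res <- matrix(runif(10^6), nrow = 1000, ncol = 1000)
      res <- exp(sqrt(res) * sqrt(res^3))
    })
    list(result = res, runtime = s.time)
  }
)

# Shut down the worker processes
stopCluster(cl)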

The runtime of each task is now stored in res.gather alongside its result, so you can read it back easily. Adding the per-task runtimes gives the total time spent by the workers in your parallel program.

> res.gather[[1]]$runtime
   user  system elapsed 
   0.42    0.04    0.48 
> res.gather[[2]]$runtime
   user  system elapsed 
   0.42    0.03    0.47 
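> res.gather[[1]]$runtime[3] + res.gather[[2]]$runtime[3]
elapsed 
   0.95 

So the combined runtime of the 2 R worker sessions is 0.95 sec, not counting the time the R master spent waiting. Note that element 3 is elapsed time; for CPU time, add the user and system entries instead (here 0.46 + 0.45 = 0.91 sec).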
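
Since the question asks for total CPU hours, here is a minimal sketch of how the per-task runtimes could be summed. The helper name total_cpu_time is hypothetical (it is not part of foreach or doParallel), and it assumes each result is a list with a runtime element produced by system.time(), as in the example above. The demo list is built sequentially with lapply() so the snippet runs without a cluster; the same call should work on res.gather.

```r
# Hypothetical helper: sums user + system CPU time over a list of task results.
# Assumes every element has a `runtime` entry returned by system.time().
total_cpu_time <- function(results) {
  per_task <- vapply(results, function(r) {
    rt <- r$runtime
    # CPU time of the process itself = user time + system time (in seconds)
    rt[["user.self"]] + rt[["sys.self"]]
  }, numeric(1))
  list(
    per_task      = per_task,             # CPU seconds for each task
    total_seconds = sum(per_task),        # CPU seconds over all tasks
    total_hours   = sum(per_task) / 3600  # the same total in CPU hours
  )
}

# Stand-in for res.gather, built without a cluster so the sketch is runnable
demo <- lapply(1:2, function(i) {
  s.time <- system.time({
    set.seed(i)
    res <- sum(exp(sqrt(runif(10^5))))
  })
  list(result = res, runtime = s.time)
})

cpu <- total_cpu_time(demo)
print(cpu$per_task)
print(cpu$total_hours)
```

With the cluster example, total_cpu_time(res.gather)$total_hours would give the workers' CPU hours; it leaves out whatever CPU time the master process itself spends dispatching tasks and collecting results.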