Yang Yang Yang Yang - 17 days ago 5
R Question

How to show the progress of code in parallel computation in R?

I am now dealing with a large dataset and some functions may take hours to process. I wonder how I can show the progress of the code through a progress bar or number(1,2,3,...,100). And I want to store the result as a data frame with two columns. Here is an example. Thanks.

require(foreach)
require(doParallel)
require(Kendall)

cores=detectCores()
cl <- makeCluster(cores-1)
registerDoParallel(cl)

mydata=matrix(rnorm(8000*500),ncol = 500)
result=as.data.frame(matrix(nrow = 8000,ncol = 2))
pb <- txtProgressBar(min = 1, max = 8000, style = 3)

foreach(i=1:8000,.packages = "Kendall",.combine = rbind) %dopar%
{
abc=MannKendall(mydata[i,])
result[i,1]=abc$tau
result[i,2]=abc$sl
setTxtProgressBar(pb, i)
}
close(pb)
stopCluster(cl)


However, when I run the code, I did not see any progress bar showing up and the result is not right. Is there any suggestion? Thanks.

Answer

The doSNOW package has support for progress bars, while doParallel does not. Here's a way to put a progress bar in your example:

require(doSNOW)
require(Kendall)
cores <- parallel::detectCores()
cl <- makeSOCKcluster(cores)
registerDoSNOW(cl)
mydata <- matrix(rnorm(8000*500), ncol=500)
pb <- txtProgressBar(min=1, max=8000, style=3)
progress <- function(n) setTxtProgressBar(pb, n)
opts <- list(progress=progress)
result <- 
  foreach(i=1:8000, .packages="Kendall", .options.snow=opts,
          .combine='rbind') %dopar% {
    abc <- MannKendall(mydata[i,])
    data.frame(tau=abc$tau, sl=abc$sl)
  }
close(pb)
stopCluster(cl)