statquant statquant - 4 months ago 48
R Question

How to log when using foreach (print or futile.logger)

I want to use the

foreach
package in conjunction with logging. I usually use the
futile.logger
package. When work is given to the workers logging information is lost (which is strange as you need to indicate to foreach the logging package)

I've seen this post but it does not use foreach

library(foreach)
library(futile.logger)
library(doParallel)
flog.threshold(DEBUG)
cluster <- makeCluster(8)
registerDoParallel(cluster)
doStuff <- function(input){
flog.debug('Doing some stuff with %s', input)
return(input)
}
res <- lapply(FUN=doStuff, X=seq(1,8,1))
# >> this prints
res2 <- foreach(input = seq(1,8,1)) %do% doStuff(input)
# >> this prints
res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)
# >> this does not
identical(res,res2) && identical(res,res3)


I do not really care about the parallele backend, can be anything, but how can I symply get the logging working

Answer

Following the solution from How can I print when using %dopar%: the idea is to use snow to set up your cluster, and set outfile="" to redirect worker output to master.

library(foreach)
library(futile.logger)
library(doParallel)

library(doSNOW)
cluster <- makeCluster(3, outfile="") # I only have 4 cores, but you could do 8
registerDoSNOW(cluster)
flog.threshold(DEBUG)

doStuff <- function(input){
  flog.info('Doing some stuff with %s', input) # change to flog.info
  return(input) 
  } 
res <- lapply(FUN=doStuff, X=seq(1,8,1))
# >> this prints                                                              
res2 <- foreach(input = seq(1,8,1)) %do% doStuff(input) 
# >> this prints
res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)  
# >> this prints too

Output:

> res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)  
Type: EXEC 
Type: EXEC 
Type: EXEC 
Type: EXEC 
Type: EXEC 
Type: EXEC 
INFO [2016-08-08 08:22:39] Doing some stuff with 3
Type: EXEC 
INFO [2016-08-08 08:22:39] Doing some stuff with 1
INFO [2016-08-08 08:22:39] Doing some stuff with 2
Type: EXEC 
Type: EXEC 
INFO [2016-08-08 08:22:39] Doing some stuff with 5
INFO [2016-08-08 08:22:39] Doing some stuff with 4
Type: EXEC 
Type: EXEC 
INFO [2016-08-08 08:22:39] Doing some stuff with 6
INFO [2016-08-08 08:22:39] Doing some stuff with 7
INFO [2016-08-08 08:22:39] Doing some stuff with 8

Output to log file. Here's an alternative that outputs to a log file, following How to log using futile logger from within a parallel method in R?. It has the advantage of having a cleaner output, but still requires flog.info:

library(doSNOW)
library(foreach)
library(futile.logger)
nworkers <- 3
cluster <- makeCluster(nworkers)
registerDoSNOW(cluster)
loginit <- function(logfile) flog.appender(appender.file(logfile))
foreach(input=rep('~/Desktop/out.log', nworkers), 
  .packages='futile.logger') %dopar% loginit(input)
doStuff <- function(input){
  flog.info('Doing some stuff with %s', input)
  return(input) 
  } 
foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input) 
stopCluster(cluster)
readLines("~/Desktop/out.log")

Output:

> readLines("~/Desktop/out.log")
[1] "INFO [2016-08-08 10:07:30] Doing some stuff with 2"
[2] "INFO [2016-08-08 10:07:30] Doing some stuff with 1"
[3] "INFO [2016-08-08 10:07:30] Doing some stuff with 3"
[4] "INFO [2016-08-08 10:07:30] Doing some stuff with 4"
[5] "INFO [2016-08-08 10:07:30] Doing some stuff with 5"
[6] "INFO [2016-08-08 10:07:30] Doing some stuff with 6"
[7] "INFO [2016-08-08 10:07:30] Doing some stuff with 7"
[8] "INFO [2016-08-08 10:07:30] Doing some stuff with 8"