S. Robinson - 7 months ago
Bash Question

Saving output from parallel jobs in R into one file

I am running a rather lengthy job that I need to replicate 100 times, so I have turned to the foreach package in R, which I run on an 8-core cluster through a shell script. I am trying to write the results from every run to the same file. I have included a simplified version of my code.

cl<-makeCluster(core-1)
registerDoParallel(cl,cores=core)
SigEpsilonSq<-list()
SigLSq<-list()
RatioMat<-list()
foreach(p=1:100) %dopar%{

functions defining my variables{...}

for(i in 1:fMaxInd){
rhoSqjMatr[,i]<-1/(1+Bb[i])*(CbAdj+AbAdj*XjBarAdj+BbAdj[i]*XjSqBarAdj)/(dataZ*dataZ)
sigmaEpsSqV[i]<-mean(rhoSqjMatr[,i])
rhoSqjMatr[,i]<-rhoSqjMatr[,i]/sigmaEpsSqV[i]
biasCorrV[,i]<-sigmaEpsSqV[i]/L*gammaQl(rhoSqjMatr[,i])
Qcbar[,i]<-Qflbar-biasCorrV[,i]
sigmaExtSq[,i]<-sigmaSqExt(sigmaEpsSqV[i], rhoSqjMatr[,i])
ratioMatr[,i]<-sigmaExtSq[,i]/(sigmaL*sigmaL)#ratio (sigma_l^e)^2/(sigmaL)^2

}

sigmaEpsSqV<-as.matrix(sigmaEpsSqV)
SigEpsilonSq[[p]]<-sigmaEpsSqV
SigLSq[[p]]<-sigmaExtSq
RatioMat[[p]]<-ratioMatr

} #End of the dopar loop

stopCluster(cl)

write.csv(SigEpsilonSq,file="Sigma_Epsilon_Sq.csv")
write.csv(SigLSq,file="Sigma_L_Sq.csv")
write.csv(RatioMat,file="Ratio_Matrix.csv")


When the job completes, my .csv files are empty. I don't think I fully understand how foreach saves results or how I can access them. I would like to avoid having to merge files manually. Also, do I need to call stopCluster(cl) right after my foreach loop, or should I wait until the very end? Any help would be much appreciated.

Answer

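This is not how foreach works; it is worth looking at the examples in the package documentation. With %dopar%, each iteration runs in a separate worker process, so assignments to SigEpsilonSq, SigLSq and RatioMat inside the loop only change the worker's copy and never reach your main session. What foreach gives back is the value of the last expression of each iteration: by default these values are collected in a list, and the .combine argument lets you choose how they are merged. So, instead of this: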

sigmaEpsSqV<-as.matrix(sigmaEpsSqV)
SigEpsilonSq[[p]]<-sigmaEpsSqV
SigLSq[[p]]<-sigmaExtSq
RatioMat[[p]]<-ratioMatr 

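End the loop body with the value you want returned, something like this (naming the elements makes them easier to pull out later):

list(SigEpsilonSq=as.matrix(sigmaEpsSqV),SigLSq=sigmaExtSq,RatioMat=ratioMatr)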
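
A minimal sketch of the difference between assigning inside the loop and returning a value, assuming the foreach and doParallel packages are installed. The two-worker cluster and the toy computation p^2 are placeholders for illustration, not the original job:

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)        # assumption: two local workers for the demo
registerDoParallel(cl)

# Assignments inside %dopar% modify a copy on the worker, not this list.
lost <- list()
ignored <- foreach(p = 1:4) %dopar% {
  lost[[p]] <- p^2
  NULL
}

# The value of the last expression is what foreach hands back.
kept <- foreach(p = 1:4) %dopar% {
  p^2
}

stopCluster(cl)

length(lost)    # 0: the list in the main session was never modified
unlist(kept)    # 1 4 9 16
```

This is why the .csv files in the question come out empty: the three lists written at the end are still the empty lists created before the loop.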

You can also pass rbind, cbind, c, ... as .combine to aggregate the results into one final output. You can even supply your own combine function, for example (rbindlist comes from the data.table package):

.combine=function(x,y)rbindlist(list(x,y))
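
As a sketch of what a hand-written combine function might look like: the hypothetical combine_runs below merges each iteration's named list element by element, so the final result holds one list per quantity rather than one list per run. The toy matrices stand in for the real computation, and %do% is used so the example runs without a cluster; the same .combine and .init arguments apply with %dopar%.

```r
library(foreach)

# Hypothetical combiner: 'acc' holds the results so far,
# 'new' is the named list returned by one iteration.
combine_runs <- function(acc, new) {
  for (nm in names(new)) {
    acc[[nm]] <- c(acc[[nm]], list(new[[nm]]))
  }
  acc
}

results <- foreach(p = 1:3,
                   .combine = combine_runs,
                   .init = list(SigLSq = list(), RatioMat = list())) %do% {
  # Toy stand-ins for sigmaExtSq and ratioMatr.
  list(SigLSq = matrix(p, 2, 2), RatioMat = matrix(p / 10, 2, 2))
}

length(results$SigLSq)        # 3: one matrix per iteration
results$RatioMat[[3]][1, 1]   # 0.3: value from the third run
```

The .init value gives the combiner a well-defined starting accumulator, so the first call already receives the target structure rather than two raw iteration results.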

The solution below should work. Without a .combine argument, foreach returns a list with one element per iteration, so results is a list of 100 named lists. (Note that .combine=list is not a shortcut for this: it is applied pairwise and produces deeply nested lists.) It might still be painful to retrieve the results and save them in the correct format. If so, you should design your own .combine function.

library(doParallel)  # also loads foreach and parallel

# core is the number of cores, defined elsewhere as in the question
cl<-makeCluster(core-1)
registerDoParallel(cl)

results = foreach(p=1:100) %dopar%{

  # functions defining my variables {...}

  for(i in 1:fMaxInd){
    rhoSqjMatr[,i]<-1/(1+Bb[i])*(CbAdj+AbAdj*XjBarAdj+BbAdj[i]*XjSqBarAdj)/(dataZ*dataZ)
    sigmaEpsSqV[i]<-mean(rhoSqjMatr[,i])
    rhoSqjMatr[,i]<-rhoSqjMatr[,i]/sigmaEpsSqV[i]
    biasCorrV[,i]<-sigmaEpsSqV[i]/L*gammaQl(rhoSqjMatr[,i])
    Qcbar[,i]<-Qflbar-biasCorrV[,i]
    sigmaExtSq[,i]<-sigmaSqExt(sigmaEpsSqV[i], rhoSqjMatr[,i])
    ratioMatr[,i]<-sigmaExtSq[,i]/(sigmaL*sigmaL) # ratio (sigma_l^e)^2/(sigmaL)^2
  }

  # The last expression is the value foreach returns for this iteration
  list(SigEpsilonSq=as.matrix(sigmaEpsSqV),SigLSq=sigmaExtSq,RatioMat=ratioMatr)

} # End of the dopar loop

# Stop the workers once the foreach loop has finished
stopCluster(cl)

# Then extract and save the results in the main session,
# e.g. results[[1]]$RatioMat is the ratio matrix from run 1
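
One way the extraction step might look, assuming each iteration returns a named list with elements SigEpsilonSq, SigLSq and RatioMat, and that every run produces matrices with the same number of rows. The toy results object below stands in for the foreach output, the helper extract is hypothetical, and the file goes to a temporary directory so the sketch does not overwrite anything:

```r
# Toy stand-in for the foreach output: 3 runs, each a named list.
results <- lapply(1:3, function(p) {
  list(SigEpsilonSq = as.matrix(c(p, p + 0.5)),
       SigLSq       = matrix(p, 2, 2),
       RatioMat     = matrix(p / 10, 2, 2))
})

# Hypothetical helper: pull one quantity out of every run and bind the
# pieces column-wise, so each run contributes its own block of columns.
extract <- function(res, name) do.call(cbind, lapply(res, `[[`, name))

SigEpsilonSq <- extract(results, "SigEpsilonSq")   # 2 rows, 1 column per run
SigLSq       <- extract(results, "SigLSq")         # 2 rows, 2 columns per run

# Writing happens once, in the main session, after the loop has finished.
out_file <- file.path(tempdir(), "Sigma_Epsilon_Sq.csv")
write.csv(SigEpsilonSq, file = out_file, row.names = FALSE)
```

Because all writing is done by the main session after the loop, there is a single file per quantity and nothing to merge by hand. The cluster is only needed for the loop itself, so it can be stopped either before or after the files are written.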