Leandro Costa - 8 months ago 128
R Question

I'm new to programming in R, but I'm trying to learn parallel computing and the use of foreach.

In my case, though, I need to loop over combinations of variables to find the best MAPE and the variables that produce it.

I started with nested for loops, but with 180k rows and a search for the best combination of 3 variables, it ran for 2 days straight without finishing.

This is the code for just 2 variables, but I think you can understand the logic.

``````
for (i in names(df3)) {
  for (j in names(df3)) {
    name4 = names(df3["DiasAusencia"])

    if (i != name4 && j != name4 && i != j) {
      df4 = df3[, c(i, j, "DiasAusencia")]
      H = holdout(df4$DiasAusencia, ratio = 2 / 3)
      Fi = fit(DiasAusencia ~ ., df4[H$tr,], model = "svm")
      testDf = df4[H$ts,]
      P = predict(Fi, testDf)
      MAE = mmetric(testDf$DiasAusencia, P, metric = "MAE")
      MAPE = mmetric(testDf$DiasAusencia, P, metric = "MAPE")
      res = cbind(testDf, predicted = P, MAE, MAPE)

      if (MAPE < BESTMAPE) {
        BESTMAPE = MAPE
        bestres = res
      }
    }
  }
}
``````

So I've looked into the foreach documentation and tried to apply it to this problem, so I could run all the combinations possible, but with no success so far. This is my foreach code:

``````
svm3 = function(var1, var2){
  if (var1 != name4 && var2 != name4 && var1 != var2) {
    df4 = df3[, c(var1, var2, "DiasAusencia")]
    H = holdout(df4$DiasAusencia, ratio = 2 / 3)
    Fi = fit(DiasAusencia ~ ., df4[H$tr,], model = "svm")
    testDf = df4[H$ts,]
    P = predict(Fi, testDf)
    MAE = mmetric(testDf$DiasAusencia, P, metric = "MAE")
    MAPE = mmetric(testDf$DiasAusencia, P, metric = "MAPE")
    res = cbind(testDf, predicted = P, MAE, MAPE)

    return(MAPE)
  }
}

sol = foreach(i=1:ncols, j=1:ncols, .combine = rbind, .packages="rminer")%dopar%{
  var1 = names(df3[i])
  var2 = names(df3[j])
  name4 = names(df3["DiasAusencia"])

  svm3(var1, var2)

  tmp = matrix(MAPE, ncol = ncols)

  return(tmp)
}
``````

This is the error I get

Hope you guys can help me out with this problem!

You're not assigning the return from `svm3` to anything:

``````
svm3(var1, var2)

tmp = matrix(MAPE, ncol = ncols)
``````

so there's nothing called `MAPE` in the second line above.
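To see why, here is a minimal sketch in base R. The function `score` is a made-up stand-in for `svm3`, and the value 12.5 is arbitrary. A variable assigned inside a function is local to that call, so the caller only sees it if the returned value is captured:

```r
# Hypothetical stand-in for svm3(): MAPE exists only inside the function.
score <- function() {
  MAPE <- 12.5   # local to this call
  MAPE           # returned value; the caller must capture it
}

score()                                   # computed, then discarded
before <- exists("MAPE", inherits = FALSE)  # FALSE: nothing was assigned here

MAPE <- score()                           # capture the return value
after <- exists("MAPE", inherits = FALSE)   # TRUE: MAPE is now defined
```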

``````
MAPE = svm3(var1, var2)
``````

should fix it.
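Something else that may be worth checking: as far as I know, `foreach(i = 1:ncols, j = 1:ncols)` advances `i` and `j` together (1 and 1, 2 and 2, ...) rather than nesting them, so it would only ever visit pairs where both variables are the same. The foreach package has a nesting operator, `%:%`, for that. Another option is to build the pairs up front with `combn` and loop over them. Below is a rough sketch of that idea, not a drop-in replacement: the toy `df3`, the helper `score_pair`, and the use of `lm` instead of the rminer SVM are all assumptions made so the example runs on its own.

```r
library(foreach)

# Sequential backend so the sketch runs anywhere; for real parallelism,
# a backend such as doParallel would be registered instead.
registerDoSEQ()

# Toy data standing in for df3 (assumption: the real data is not shown).
set.seed(1)
df3 <- data.frame(a = rnorm(60), b = rnorm(60), c = rnorm(60))
df3$DiasAusencia <- 10 + 2 * df3$a - df3$b + rnorm(60, sd = 0.1)

# Every unordered pair of predictors, one pair per column.
predictors <- setdiff(names(df3), "DiasAusencia")
pairs <- combn(predictors, 2)

# Hypothetical stand-in for svm3(): fits lm() on the first 2/3 of the rows
# and returns the MAPE on the remaining rows as a plain number.
score_pair <- function(var1, var2, data) {
  d <- data[, c(var1, var2, "DiasAusencia")]
  train <- seq_len(nrow(d)) <= round(nrow(d) * 2 / 3)
  model <- lm(DiasAusencia ~ ., data = d[train, ])
  pred <- predict(model, d[!train, ])
  actual <- d$DiasAusencia[!train]
  100 * mean(abs((actual - pred) / actual))
}

sol <- foreach(k = seq_len(ncol(pairs)), .combine = rbind) %dopar% {
  # Capture the return value instead of relying on a variable
  # that only exists inside the function.
  MAPE <- score_pair(pairs[1, k], pairs[2, k], df3)
  data.frame(var1 = pairs[1, k], var2 = pairs[2, k], MAPE = MAPE)
}

# Row with the lowest MAPE.
best <- sol[which.min(sol$MAPE), ]
```

Each iteration returns a one-row data frame, so `.combine = rbind` yields one row per pair and the best pair can be picked afterwards with `which.min`. Using `combn` also visits each pair once, whereas two nested loops over all names visit every pair in both orders.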
