Leandro Costa - 2 months ago 54

R Question

I'm new to programming in R, but I'm trying to learn parallel computing and the use of foreach.

In my case, though, I need to loop over combinations of variables to find the best MAPE and the variables that produce it.

I started with nested for loops, but with 180k rows and a search for the best combination of 3 variables, it ran for 2 days straight without finishing.

This is the code for just 2 variables, but I think you can understand the logic.

```
for (i in names(df3)) {
  for (j in names(df3)) {
    name4 = names(df3["DiasAusencia"])
    if (i != name4 && j != name4 && i != j) {
      df4 = df3[, c(i, j, "DiasAusencia")]
      H = holdout(df4$DiasAusencia, ratio = 2 / 3)
      Fi = fit(DiasAusencia ~ ., df4[H$tr,], model = "svm")
      testDf = df4[H$ts,]
      P = predict(Fi, testDf)
      MAE = mmetric(testDf$DiasAusencia, P, metric = "MAE")
      MAPE = mmetric(testDf$DiasAusencia, P, metric = "MAPE")
      res = cbind(testDf, predicted = P, MAE, MAPE)
      if (MAPE < BESTMAPE) {
        BESTMAPE = MAPE
        bestres = res
      }
    }
  }
}
```

So I've looked into the foreach documentation and tried to apply it to this problem, so I could run all the combinations possible, but with no success so far. This is my foreach code:

```
svm3 = function(var1, var2) {
  if (var1 != name4 && var2 != name4 && var1 != var2) {
    df4 = df3[, c(var1, var2, "DiasAusencia")]
    H = holdout(df4$DiasAusencia, ratio = 2 / 3)
    Fi = fit(DiasAusencia ~ ., df4[H$tr,], model = "svm")
    testDf = df4[H$ts,]
    P = predict(Fi, testDf)
    MAE = mmetric(testDf$DiasAusencia, P, metric = "MAE")
    MAPE = mmetric(testDf$DiasAusencia, P, metric = "MAPE")
    res = cbind(testDf, predicted = P, MAE, MAPE)
    return(MAPE)
  }
}

sol = foreach(i = 1:ncols, j = 1:ncols, .combine = rbind, .packages = "rminer") %dopar% {
  var1 = names(df3[i])
  var2 = names(df3[j])
  name4 = names(df3["DiasAusencia"])
  svm3(var1, var2)
  tmp = matrix(MAPE, ncol = ncols)
  return(tmp)
}
```

This is the error I get:

Error in { : task 1 failed - "object 'MAPE' not found"

Hope you guys can help me out with this problem!

Thank you in advance.

Answer Source

You're not assigning the return value of `svm3` to anything:

```
svm3(var1, var2)
tmp = matrix(MAPE, ncol = ncols)
```

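so there's nothing called `MAPE` in the second line above: the `MAPE` computed inside `svm3` is local to that function and disappears when the function returns.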

```
MAPE = svm3(var1, var2)
```

should fix it.
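To make the scoping point concrete, here is a minimal base-R sketch. It is not the original code: `score_pair` is a hypothetical stand-in for `svm3` (it computes a toy error instead of fitting an SVM), and the `toy` data frame is invented for illustration. It also lists each unordered pair of predictors explicitly with `combn`, because, as far as I know, `foreach(i = ..., j = ...)` advances `i` and `j` together rather than nesting them, so it would only ever visit the `i == j` cases.

```r
# Hypothetical stand-in for svm3(): the real function would fit the model
# and return its MAPE; this one just returns a toy error value.
score_pair <- function(var1, var2, data) {
  MAPE <- mean(abs(data[[var1]] - data[[var2]]))  # local to this function
  MAPE                                            # handed back to the caller
}

# Toy data, invented for illustration only.
toy <- data.frame(a = c(1, 2, 3), b = c(2, 4, 6), c = c(1, 1, 1),
                  DiasAusencia = c(0, 1, 2))

# Calling without assigning discards the result: no MAPE exists out here.
score_pair("a", "b", toy)
mape_visible <- exists("MAPE", inherits = FALSE)

# Enumerate each unordered pair of predictors once, excluding the target.
predictors <- setdiff(names(toy), "DiasAusencia")
pairs <- combn(predictors, 2, simplify = FALSE)

# Capture the returned value for every pair, then pick the smallest.
scores <- vapply(pairs, function(p) score_pair(p[1], p[2], toy), numeric(1))
best <- pairs[[which.min(scores)]]
```

The same `pairs` list could probably be handed to foreach one element per task, e.g. `foreach(p = pairs, .combine = rbind) %dopar% { ... }`, with the body returning the captured score. Listing pairs up front also halves the work compared with looping over both `(i, j)` and `(j, i)`.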