Qbik - 1 year ago

R Question

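The help page for `randomForest::randomForest()` says:

"classwt - Priors of the classes. Need not add up to one. Ignored for regression."

Could setting the `classwt` parameter help when you have heavily unbalanced data, i.e. when the priors of the classes differ strongly?

How should I set `classwt` when the training dataset has 3 classes with a vector of priors equal to (p1,p2,p3), and in the test set the priors are (q1,q2,q3)?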

Answer Source

Could setting the classwt parameter help when you have heavily unbalanced data, i.e. when the priors of the classes differ strongly?

Yes, setting values of classwt can be useful for unbalanced datasets. I also agree with joran that these values are transformed into probabilities for sampling the training data (according to Breiman's arguments in his original article).

How should classwt be set when the training dataset has 3 classes with a vector of priors equal to (p1,p2,p3), and in the test set the priors are (q1,q2,q3)?

For training you can simply specify

```
library(randomForest)
rf <- randomForest(x = x, y = y, classwt = c(p1, p2, p3))
```
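As a rough illustration of the call above, here is a minimal sketch that builds a 3-class training set with unequal class sizes and passes the empirical class proportions as `classwt`. The subsampled `iris` data, the 50/20/5 split, and the variable names `idx` and `p` are assumptions made for this example only; they are not from the question. The weights are given in the order of `levels(y)`.

```r
library(randomForest)

set.seed(42)  # reproducible subsampling and forest

# Hypothetical unbalanced data: subsample iris to 50 / 20 / 5 rows per class.
idx <- c(which(iris$Species == "setosa"),
         sample(which(iris$Species == "versicolor"), 20),
         sample(which(iris$Species == "virginica"), 5))
x <- iris[idx, 1:4]
y <- droplevels(iris$Species[idx])

# Empirical class proportions (p1, p2, p3), in the order of levels(y).
p <- as.numeric(prop.table(table(y)))

# classwt need not add up to one; here it happens to.
rf <- randomForest(x = x, y = y, classwt = p)

# Out-of-bag confusion matrix with per-class error.
print(rf$confusion)
```

Comparing `rf$confusion` against a fit without `classwt` is one way to see whether the weights change the per-class error on your own data.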

For the test set no priors can be used: 1) the `predict` method of the randomForest package has no such option; 2) weights only make sense for training the model, not for prediction.
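The first point can be checked directly. The sketch below lists the arguments of the `predict` method for `randomForest` objects and tests whether `classwt` is among them; the variable names `predict_args` and `has_classwt` are made up for this example, and the exact argument list may vary with the installed package version.

```r
library(randomForest)

# Look up the S3 predict method for randomForest objects and list its arguments.
predict_args <- names(formals(getS3method("predict", "randomForest")))
print(predict_args)

# Check whether a class-prior argument is available at prediction time.
has_classwt <- "classwt" %in% predict_args
print(has_classwt)
```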