Abhishek - 5 months ago 44

R Question

I am using `randomForest`:

`rf_object<-randomForest(data_matrix, label_factor, cutoff=c(k,1-k))`

where k ranges from 0.1 to 0.9.

`pred <- predict(rf_object,test_data_matrix)`

I compared the output of the random forest classifier with the true labels, so I have performance measures such as accuracy, MCC, sensitivity and specificity for each of the 9 cutoff points.

Now I want to plot the ROC curve and obtain the area under it to see how good the performance is. Most R packages (such as ROCR and pROC) require predictions and labels, but I only have sensitivity (TPR) and specificity (1-FPR).

Can anyone tell me whether the cutoff method is a correct or reliable way to produce a ROC curve?

Do you know any way to obtain ROC curve and area under the curve using TPR and FPR?
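For what it's worth, a rough curve and area can be computed from the (FPR, TPR) pairs alone with the trapezoidal rule. The sketch below uses base R only. The helper name `roc_from_rates` and the sensitivity/specificity numbers are made up for illustration and are not results from the question. With only 9 cutoffs the curve is coarse, and since each cutoff comes from a separate `randomForest` call the points may also carry some run-to-run noise.

```r
# Illustrative (made-up) sensitivity/specificity pairs, one per cutoff k.
sens <- c(0.98, 0.95, 0.90, 0.84, 0.76, 0.65, 0.52, 0.36, 0.18)
spec <- c(0.20, 0.38, 0.55, 0.68, 0.78, 0.86, 0.92, 0.96, 0.99)

# Hypothetical helper: build ROC points from the rates and integrate them.
roc_from_rates <- function(sens, spec) {
  fpr <- c(0, 1 - spec, 1)          # add the end points (0,0) and (1,1)
  tpr <- c(0, sens, 1)
  ord <- order(fpr, tpr)            # sort points from left to right
  fpr <- fpr[ord]
  tpr <- tpr[ord]
  # Trapezoidal rule: segment width times mean height of its two ends.
  auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
  list(fpr = fpr, tpr = tpr, auc = auc)
}

roc <- roc_from_rates(sens, spec)
plot(roc$fpr, roc$tpr, type = "b",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)               # chance line for reference
roc$auc
```

The area obtained this way is only an approximation of the AUC that a package would compute from continuous scores, because the curve between the sampled cutoffs is unknown.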

I also tried training the random forest with the following commands. This way the predictions were continuous and were accepted by `ROCR` and `pROC`:

`rf_object <- randomForest(data_matrix, label_vector)`

`pred <- predict(rf_object, test_data_matrix)`

Thank you for taking the time to read my problem! I have spent a long time searching for this. Thank you for any suggestions or advice.

Answer

Why don't you output class probabilities? That gives you a ranking of your predictions, which you can pass directly to any ROC package.

```
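library(randomForest)
m <- randomForest(data_matrix, labels)             # labels must be a factor
prob <- predict(m, newdata_matrix, type = 'prob')  # matrix with one probability column per class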
```

Note that, to use randomForest as a classification tool, `labels` must be a factor.
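As a minimal end-to-end sketch of this approach, the example below trains a forest on a two-class subset of the built-in `iris` data (chosen only for illustration, not the asker's data) and passes the predicted probabilities to `pROC`. It assumes the `randomForest` and `pROC` packages are installed. The train/test split and the choice of `virginica` as the positive class are arbitrary.

```r
library(randomForest)
library(pROC)

set.seed(1)
# Two-class subset of iris, used purely as stand-in data.
d <- droplevels(subset(iris, Species != "setosa"))
train_idx <- sample(nrow(d), 70)
x_train <- d[train_idx, 1:4]
y_train <- d$Species[train_idx]     # a factor, so randomForest classifies
x_test  <- d[-train_idx, 1:4]
y_test  <- d$Species[-train_idx]

m <- randomForest(x_train, y_train)
prob <- predict(m, x_test, type = "prob")   # one column per class level

# Use the probability of the positive class as the continuous score.
roc_obj <- roc(response = y_test, predictor = prob[, "virginica"],
               levels = c("versicolor", "virginica"), direction = "<")
plot(roc_obj)
auc(roc_obj)
```

Because the score is continuous, the ROC package sweeps over every distinct threshold itself, so there is no need to refit the model for each cutoff.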