Amandeep Rathee - 5 days ago
R Question

How to perform ensembling in a classifier model in R

I have a data frame where the variable to be predicted is a factor with 28 possible levels.

I trained three classifiers on the training set: a support vector machine (SVM), a random forest (RF), and k-nearest neighbors (kNN).

I now have three prediction vectors, one per algorithm. Each has good accuracy, about 80-90%.

I want to combine them into an ensemble and predict the final outcome by majority vote across the three algorithms.
Note: SVM has the highest accuracy, followed by RF and then kNN.
For example:

SVM prediction | RF prediction | kNN prediction | Final outcome
---------------|---------------|----------------|--------------
A              | A             | C              | A
D              | J             | D              | D
C              | C             | C              | C
I              | F             | K              | I (pick SVM's outcome in case of a tie)


As you can see, what I want is very simple. How can I do this in R? And is there any other way to build an ensemble in this situation?

Answer

There is a statistical term for the most popular vote: the mode.

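SVMprediction <- c('A', 'D', 'C', 'I')
RFprediction  <- c('A', 'J', 'C', 'F')
KNNprediction <- c('C', 'D', 'C', 'K')
# Column order matters: on a tie, getmode() returns the leftmost value (SVM here).
data <- data.frame(SVMprediction, RFprediction, KNNprediction)

# Return the most frequent value in v; ties go to the value that appears first.
getmode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}

# Take the vote row by row (one row per observation).
apply(data, 1, getmode)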

[1] "A" "D" "C" "I"
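
Note that the last row comes out as "I" only because SVMprediction is the first column: when all three models disagree, which.max() picks the first unique value. If you would rather not depend on column order, here is a minimal sketch that names the fallback model explicitly. The helper name vote_with_fallback and its fallback argument are my own, not from any package.

```r
# Hypothetical helper: majority vote with an explicitly named fallback column.
vote_with_fallback <- function(preds, fallback = "SVMprediction") {
  m <- as.matrix(preds)              # character matrix, one row per observation
  out <- character(nrow(m))
  for (i in seq_len(nrow(m))) {
    counts  <- table(m[i, ])
    winners <- names(counts)[counts == max(counts)]
    # A single winner is the majority label; otherwise defer to the fallback model.
    out[i] <- if (length(winners) == 1) winners else m[i, fallback]
  }
  out
}

# Same predictions as above, with the columns deliberately in a different order.
preds <- data.frame(
  KNNprediction = c("C", "D", "C", "K"),
  RFprediction  = c("A", "J", "C", "F"),
  SVMprediction = c("A", "D", "C", "I"),
  stringsAsFactors = FALSE
)

final <- vote_with_fallback(preds)
print(final)
```

With three models the only possible tie is a three-way disagreement, so the fallback is always one of the tied labels. With an even number of models the fallback's label may not be among the tied winners, so you may want a different rule there.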

So you can use the same function to ensemble any number of predictors.
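
As for other ways to ensemble here, one common variation is a weighted vote, where each model's vote counts in proportion to how much you trust it. Below is a rough sketch, not a definitive implementation. The weights are made-up placeholders that only respect the ordering from the question (SVM, then RF, then kNN); in practice you would use accuracies measured on held-out data. The function name weighted_vote is hypothetical.

```r
# Placeholder weights (assumed values, not measured accuracies).
weights <- c(SVMprediction = 0.90, RFprediction = 0.85, KNNprediction = 0.80)

preds <- data.frame(
  SVMprediction = c("A", "D", "C", "I"),
  RFprediction  = c("A", "J", "C", "F"),
  KNNprediction = c("C", "D", "C", "K"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pick the label with the largest total weight in each row.
weighted_vote <- function(preds, weights) {
  # Select columns by name so each weight lines up with its model.
  m <- as.matrix(preds[, names(weights), drop = FALSE])
  unname(apply(m, 1, function(row) {
    scores <- tapply(weights, row, sum)   # total weight per predicted label
    names(scores)[which.max(scores)]
  }))
}

final_weighted <- weighted_vote(preds, weights)
print(final_weighted)
```

With these particular weights any two models outvote the third, so the result matches the plain majority vote, and a three-way disagreement goes to the highest-weighted model (SVM). Weights that differ more strongly would let a single trusted model override the other two.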

Does it help?
