Amandeep Rathee - 8 months ago 45

# How to perform ensembling in a classifier model in R

I have a data frame where the variable to be predicted has 28 possible factor outcomes.

I ran three classification algorithms on the training set: support vector machine (SVM), random forest (RF), and k-nearest neighbors (kNN).

I now have three prediction vectors, one per algorithm. Each has a good accuracy of about 80-90%.

I want to combine them into an ensemble and predict the final outcome by majority vote across the three algorithms.
Note: SVM has the highest accuracy, followed by RF and then kNN.
For example:

SVM prediction | RF prediction | KNN prediction | Final outcome
---------------|---------------|----------------|--------------
A              | A             | C              | A
D              | J             | D              | D
C              | C             | C              | C
I              | F             | K              | I (pick SVM's outcome in case of a tie)


As you can see, what I want is very simple. How can I do this in R? And is there any other way to build an ensemble in this situation?

There is a statistical term for the winner of a popular vote: the mode. Compute the mode of each row of predictions:

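    # Example predictions, one vector per model.
    SVMprediction <- c('A', 'D', 'C', 'I')
    RFprediction  <- c('A', 'J', 'C', 'F')
    KNNprediction <- c('C', 'D', 'C', 'K')
    # Put the most accurate model first: its column wins ties (see getmode).
    data <- data.frame(SVMprediction, RFprediction, KNNprediction)

    # Return the most frequent value in v. On a tie, which.max picks the
    # first maximum, i.e. the tied value that appears earliest in v.
    getmode <- function(v) {
      uniqv <- unique(v)
      uniqv[which.max(tabulate(match(v, uniqv)))]
    }

    # Apply the vote row by row.
    apply(data, 1, getmode)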


[1] "A" "D" "C" "I"
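One caveat, assuming your real predictions come out of `predict()` as factors rather than character vectors: `apply()` converts the data frame to a character matrix first, so the vote comes back as a character vector. Here is a minimal sketch of turning it back into a factor. The level set `lv` and the three prediction vectors are made up for illustration (your data would have 28 levels).

```r
# Hypothetical level set; in the real data this would hold all 28 classes.
lv <- c("A", "C", "D", "F", "I", "J", "K")

# Hypothetical factor predictions, as a classifier's predict() might return.
svm_pred <- factor(c("A", "D", "C", "I"), levels = lv)
rf_pred  <- factor(c("A", "J", "C", "F"), levels = lv)
knn_pred <- factor(c("C", "D", "C", "K"), levels = lv)

# Same mode function as above; ties go to the earliest value in v.
getmode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}

# apply() works on a character matrix, so the factor labels (not the
# integer codes) are what gets voted on.
votes <- data.frame(svm_pred, rf_pred, knn_pred)
final <- factor(apply(votes, 1, getmode), levels = lv)
```

Keeping the original levels on `final` should make it directly comparable with the true labels, for example in `table(final, truth)`.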

The same approach works for an ensemble of any number of predictors: add one column per model.
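As for other ways to ensemble: one common variation is a weighted vote, where each model's vote counts in proportion to how much you trust it. This is only a sketch. The weights below are hypothetical placeholders chosen to respect the SVM > RF > kNN ordering from the question, and `weighted_vote` is a name I made up.

```r
# Predictions from the example above, one column per model.
preds <- data.frame(
  SVM = c("A", "D", "C", "I"),
  RF  = c("A", "J", "C", "F"),
  KNN = c("C", "D", "C", "K"),
  stringsAsFactors = FALSE
)

# Hypothetical weights (e.g. validation accuracies); replace with your own.
weights <- c(SVM = 0.90, RF = 0.85, KNN = 0.80)

# Sum the weights of the models voting for each label and return the
# label with the largest total.
weighted_vote <- function(row, w) {
  totals <- tapply(w, row, sum)
  names(totals)[which.max(totals)]
}

final <- apply(preds, 1, weighted_vote, w = weights)
```

With only three models and weights this close together, the result matches the plain majority vote, and when all three disagree the highest-weighted model (SVM here) wins. The weights start to matter once you have more models or larger gaps in accuracy.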

Does it help?