moshem - 10 months ago
R Question

J48 tree in R - train and test classification

I want to train and test a J48 decision tree in R.
Here is my code:

library("RWeka")

data <- read.csv("try.csv")
resultJ48 <- J48(classificationTry~., data)

summary(resultJ48)


But I want to split my data into 70% training and 30% test sets. How can I do that with the J48 algorithm?

Many thanks!

knb
Answer

Use the sample.split() function from the caTools package. It is more lightweight than the caret package (which is a meta-package, if I remember correctly):

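library(caTools)  # sample.split()
library(RWeka)    # J48()

data <- read.csv("try.csv")
# classAttribute stands for your class column (classificationTry in the question).
# J48 needs a factor class; convert once so the formula can use the column directly.
data$classAttribute <- as.factor(data$classAttribute)

set.seed(123)  # optional, makes the random split reproducible
# Split on the class column so both subsets keep similar class proportions
spl <- sample.split(data$classAttribute, SplitRatio = 0.7)

dataTrain <- subset(data, spl == TRUE)
dataTest <- subset(data, spl == FALSE)

resultJ48 <- J48(classAttribute ~ ., data = dataTrain)
dataTest.pred <- predict(resultJ48, newdata = dataTest)
# Confusion matrix: rows are actual classes, columns are predicted classes
table(dataTest$classAttribute, dataTest.pred)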
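
If you would rather not add caTools, roughly the same 70/30 split can be done with base R's sample(). The sketch below is only an illustration: it uses the built-in iris data as a stand-in for try.csv, an arbitrary seed, and hypothetical variable names (trainIdx, irisTrain, irisTest, fit). Note that, unlike sample.split(), a plain sample() of row indices does not try to keep the class proportions equal in both subsets. It also shows one way to turn the confusion matrix into an overall accuracy figure.

```r
library(RWeka)  # provides J48(); needs a working Java installation

set.seed(42)    # arbitrary seed, only so the split is reproducible
data(iris)      # stand-in dataset; Species plays the role of the class column

# Draw 70% of the row indices at random for training, keep the rest for testing
trainIdx  <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
irisTrain <- iris[trainIdx, ]
irisTest  <- iris[-trainIdx, ]

# Fit the tree on the training rows only, then predict the held-out rows
fit  <- J48(Species ~ ., data = irisTrain)
pred <- predict(fit, newdata = irisTest)

# Confusion matrix: rows are actual classes, columns are predicted classes
confusion <- table(actual = irisTest$Species, predicted = pred)
print(confusion)

# Overall accuracy: correctly classified test rows divided by all test rows
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)

# RWeka can also produce Weka's own evaluation summary on the test set
print(evaluate_Weka_classifier(fit, newdata = irisTest, class = TRUE))
```

With your own data, replace iris with the data frame read from try.csv and Species with your class column, making sure that column is a factor first.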