donpresente, 4 years ago
R Question

Random Forest - Caret - Time Series

I have a time series (Apple stock closing prices) that I turned into a data frame to fit a random forest using caret. I lagged it by 1 day, 2 days and 6 days. I want to predict the next 2 days, i.e. a two-step-ahead forecast. But caret uses the predict function, which does not accept the argument h the way the forecast function does. I have seen some people try the argument n.ahead, but that is not working for me. Any advice? See the code:

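library(caret)

# dplyr::lag shifts the values by k rows; base R's stats::lag only changes
# the time attributes and leaves a plain vector's values unshifted
df <- data.frame(APPL)
df$f1 <- dplyr::lag(df$APPL, 1)
df$f2 <- dplyr::lag(df$APPL, 2)
df$f3 <- dplyr::lag(df$APPL, 6)

# change column names

colnames(df) <- c("price", "price_1", "price_2", "price_6")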

# remove rows (days) with NA.
df<-df[complete.cases(df),]

fitControl <- trainControl(
  method = "repeatedcv",
  number = 10,
  repeats = 1,
  classProbs = FALSE,
  verboseIter = TRUE,
  preProcOptions = list(thresh = 0.95, na.remove = TRUE, verbose = TRUE))

set.seed(1234)

rf_grid <- expand.grid(mtry = 1:3)

fit <- train(price ~ .,
             data = df,
             method = "rf",
             preProcess = c("center", "scale"),
             tuneGrid = rf_grid,
             trControl = fitControl,
             ntree = 200,
             metric = "RMSE")


nextday <- predict(fit,`WHAT GOES HERE?`)


If I call just predict(fit), it uses the whole dataset as newdata, which I think is wrong. The other thing I was thinking about is a loop: predict one step ahead, since I have the data from 1, 2 and 6 days ago, and then, for the two-step-ahead forecast, fill the 1-day-ago "cell" with the forecast I made before.

Answer Source

Right now, you can't pass other options to the underlying predict method. There is a proposed change that might enable this though.

In your case, you should give the predict function a data frame that has the appropriate predictors for the next few observations.
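As a rough sketch of what that could look like for the two-step case, the helper below builds a one-row data frame of lagged predictors, predicts, appends the prediction to the history, and repeats. The function name forecast_recursive and its arguments are hypothetical, not part of caret. It assumes the predictors are named price_1, price_2 and price_6 as in the question, and that prices are ordered oldest to newest. To keep the example runnable without extra packages, the demo uses synthetic prices and an lm fit as a stand-in for the caret model.

```r
# Recursive multi-step forecast (hypothetical helper, not a caret function).
# `model` is anything whose predict() method accepts `newdata`,
# e.g. a caret `train` object; `prices` is ordered oldest to newest.
forecast_recursive <- function(model, prices, h = 2, lags = c(1, 2, 6)) {
  stopifnot(length(prices) >= max(lags))
  history <- as.numeric(prices)
  preds <- numeric(h)
  for (step in seq_len(h)) {
    n <- length(history)
    # One-row data frame holding the values 1, 2 and 6 steps back,
    # with the same column names the model was trained on
    newdata <- as.data.frame(t(history[n + 1 - lags]))
    names(newdata) <- paste0("price_", lags)
    preds[step] <- as.numeric(predict(model, newdata = newdata))
    # The forecast becomes the "1 day ago" value for the next step
    history <- c(history, preds[step])
  }
  preds
}

# Demo on synthetic data (assumption: not real stock prices)
set.seed(1)
prices <- 100 + cumsum(rnorm(60))
idx <- 7:length(prices)
train_df <- data.frame(
  price   = prices[idx],
  price_1 = prices[idx - 1],
  price_2 = prices[idx - 2],
  price_6 = prices[idx - 6]
)
stand_in <- lm(price ~ ., data = train_df)  # stand-in for the caret fit
next_two <- forecast_recursive(stand_in, prices, h = 2)
print(next_two)
```

With the model from the question, the call would presumably be forecast_recursive(fit, df$price, h = 2). Since predict on a caret train object applies the stored centering and scaling to newdata, the helper passes raw prices.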
