tpayne - 6 months ago

R Question

```
library(randomForest)
library(dyn)

set.seed(123)
tz <- zoo(cbind(Y = rnorm(10), x = rnorm(10)))
tz[10, "Y"] <- NA

rr <- tz
rr <- cbind(`lag(Y, -1)` = lag(rr$Y, -1), rr)

fit <- dyn$randomForest(Y ~ lag(Y, -1) + x, tz, subset = seq_len(10 - 1))
pred <- predict(fit, newdata = rr)
```

I am trying to get the random forest to predict the 10th observation, but the prediction keeps coming back as NA. I think it has something to do with the lagged value, but I am not sure how this works. Does anyone know how to make this work?

Answer

I think you were adding an unnecessary line of code.

```
library(randomForest)
library(dyn)

set.seed(123)
tz <- zoo(cbind(Y = rnorm(10), x = rnorm(10)))
rr <- tz
rr <- cbind(`lag(Y, -1)` = lag(rr$Y, -1), rr)
fit <- dyn$randomForest(Y ~ lag(Y, -1) + x, tz, subset = seq_len(10 - 1))
predict(fit, newdata = rr)

1 2 3 4 5 6 7 8 9 10
0.65469597 0.63274585 0.52821489 -0.58116470 -0.28673507 0.73862391 -0.31800427 -0.59019492 -0.34942432 -0.02772214
```

The extra line is `tz[10, "Y"] <- NA`. If you remove it, as in the code above, the 10th element is predicted.
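If the 10th value of `Y` really is unknown, so removing the NA is not an option, it may help to look at what the lag does. With zoo, `lag(Y, -1)` shifts the series forward by one step, so the predictor for row 10 is `Y[9]`, which is known. The following is only a rough sketch of one possible workaround, not the `dyn` approach above. It assumes you are happy to build the lag column by hand and call `randomForest` directly; the names `Ylag1`, `d`, `train` and `pred10` are made up for illustration.

```r
library(zoo)
library(randomForest)

set.seed(123)
tz <- zoo(cbind(Y = rnorm(10), x = rnorm(10)))
tz[10, "Y"] <- NA                      # the value we want to predict

# lag(, -1) on a zoo series moves each value one step later in time:
# the entry at index t holds Y from index t - 1 (index 1 is dropped).
Ylag1 <- lag(tz$Y, -1)

# Build a plain data frame with the lag column padded by hand
# (hypothetical helper columns, not part of dyn).
y <- as.numeric(coredata(tz$Y))
d <- data.frame(Y = y,
                Ylag1 = c(NA, head(y, -1)),
                x = as.numeric(coredata(tz$x)))

# Rows 2..9 have both Y and its lag available, so train on those.
train <- d[2:9, ]
fit <- randomForest(Y ~ Ylag1 + x, data = train)

# Row 10 has a known lag (Y[9]) and a known x, so it can be predicted
# even though Y[10] itself is NA.
pred10 <- predict(fit, newdata = d[10, c("Ylag1", "x")])
pred10
```

Because the response column is left out of `newdata`, the missing `Y[10]` never enters the prediction step. Only the predictors for row 10 need to be complete.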