Divi - 8 months ago 110

R Question

I have a large dataset with multiple classes. My aim is to fit a model to each class, then predict the results and visualize them for each class in a facet.

For a reproducible example, I have created something basic using `mtcars`:

`mtcars = data.table(mtcars)`

model = mtcars[, list(fit = list(lm(mpg~disp+hp+wt))), keyby = cyl]

setkey(mtcars, cyl)

mtcars[model, pred := predict(i.fit[[1]], .SD), by = .EACHI]

ggplot(data = mtcars, aes(x = mpg, y = pred)) + geom_line() + facet_wrap(~cyl)

However, I would like to try something like the code below, which does not yet work. This attempt uses a list of formulas, but I am also looking to send different models (some GLMs, a few trees) to each subset of the data.

`mtcars = data.table(mtcars)`

factors = list(c("disp","wt"), c("disp"), c("hp"))

form = lapply(factors, function(x) as.formula(paste("mpg~",paste(x,collapse="+"))))

model = mtcars[, list(fit = list(lm(form))), keyby = cyl]

setkey(mtcars, cyl)

mtcars[model, pred := predict(i.fit[[1]], .SD), by = .EACHI]

ggplot(data = mtcars, aes(x = mpg, y = pred)) + geom_line() + facet_wrap(~cyl)

Answer

Here's an approach where we set up `predict` for each model as an unevaluated list, evaluate them within the `data.table` object, `gather` the output, and pass it into `ggplot`:

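```
library(data.table)
library(ggplot2)
# Formulas from the question: one predictor set per model
factors <- list(c("disp", "wt"), "disp", "hp")
form <- lapply(factors, function(x) as.formula(paste("mpg ~", paste(x, collapse = " + "))))
# Unevaluated list of fitted values, one element per formula
models <- quote(list(
  predict(lm(form[[1]], .SD)),
  predict(lm(form[[2]], .SD)),
  predict(lm(form[[3]], .SD))))
d <- data.table(mtcars)  # fresh data.table, so := leaves mtcars untouched
# Evaluate the list within each cyl group and store the results as new columns
d[, c("est1", "est2", "est3") := eval(models), by = cyl]
# Reshape to long format: one row per observation and model
d <- tidyr::gather(d, key = model, value = pred, est1:est3)
ggplot(d, aes(x = mpg, y = pred)) + geom_line() + facet_grid(cyl ~ model)
```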

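Output: a grid of line plots of `pred` against `mpg`, with one row per `cyl` value and one column per model (`est1`, `est2`, `est3`).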
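The code in this answer fits every formula to every class. If the goal is instead one model per class, possibly of different types as the question mentions, one option is to keep a named list of functions and look up the right one inside each group via `.BY`. The following is only a minimal sketch of that idea, not part of the original answer; the `fitters` list and the particular model assigned to each `cyl` value are illustrative assumptions.

```r
library(data.table)

d <- data.table(mtcars)  # fresh data.table; the built-in mtcars is left untouched

# Hypothetical mapping: one function per cyl value. Each takes a group's data
# and returns one prediction per row, on the response scale.
fitters <- list(
  "4" = function(dat) predict(lm(mpg ~ disp + wt, data = dat)),
  "6" = function(dat) predict(glm(mpg ~ disp, family = gaussian(), data = dat),
                              type = "response"),
  "8" = function(dat) predict(glm(mpg ~ hp, family = Gamma(link = "log"), data = dat),
                              type = "response")
)

# .BY$cyl is the current group's cyl value; use it to pick that group's function
d[, pred := as.numeric(fitters[[as.character(.BY$cyl)]](.SD)), by = cyl]
```

The resulting `pred` column can be plotted with the question's original call using `facet_wrap(~cyl)`. Any function that accepts a group's data and returns one numeric prediction per row could be placed in `fitters`, which is where a tree-based fit would go.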