Michael Gruenstaeudl - 1 year ago
R Question

Inconsistent behaviour when using lapply and anova in R

The documentation for function anova.lm provides an example in which five different linear models are set up and compared via function anova:

fit0 <- lm(sr ~ 1, data = LifeCycleSavings)
fit1 <- update(fit0, . ~ . + pop15)
fit2 <- update(fit1, . ~ . + pop75)
fit3 <- update(fit2, . ~ . + dpi)
fit4 <- update(fit3, . ~ . + ddpi)
anova(fit0, fit1, fit2, fit3, fit4, test = "F")

You could also use lapply to execute the function anova over the models consecutively.

fit_L = list(fit0, fit1, fit2, fit3, fit4)
lapply(fit_L, anova)

Analogously, the documentation of the package diversitree provides an example in which two models are set up and compared via function anova:

pars <- c(0.1, 0.2, 0.03, 0.03, 0.01, 0.01)
phy <- tree.bisse(pars, max.t=60, x0=0)
lik <- make.bisse(phy, phy$tip.state)
fit <- find.mle(lik, pars)
lik.l <- constrain(lik, lambda0 ~ lambda1)
fit.l <- find.mle(lik.l, pars[-2])
anova(fit, equal.lambda=fit.l)

Here, however, I cannot use lapply to execute the function anova over the two models.

fit_L = list(fit, fit.l)
lapply(fit_L, anova)
# Error in anova.fit.mle(X[[i]], ...) : Need to specify more than one model

Can anyone think of a way to apply lapply (or similar functions) to the example from package diversitree?


To clarify my question: The underlying idea of my post is to make the call to anova independent of the precise number of models to be tested. For some analyses, I don't know a priori how many models will be tested, so it would be nice to run the anova across however many models happen to be in list fit_L.

Answer

lapply iterates over the elements of a list and applies a function to each of them separately. This is not what you want. You want to pass all list elements together as arguments to a single function call, which is what do.call does:

do.call(anova, c(fit_L, test = "F"))
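As a quick check that this does what the question asks for, here is a minimal sketch using only the built-in LifeCycleSavings data. It assumes a shorter list of three models (any length of two or more should behave the same way) and compares the do.call result with the hand-written call; the names tab_docall and tab_direct are just illustrative.

```r
# Three nested linear models, as in the question (shortened for brevity).
fit0 <- lm(sr ~ 1, data = LifeCycleSavings)
fit1 <- update(fit0, . ~ . + pop15)
fit2 <- update(fit1, . ~ . + pop75)

# The list may hold any number of models (two or more).
fit_L <- list(fit0, fit1, fit2)

# do.call spreads the list elements out as separate arguments,
# so this is the same as anova(fit0, fit1, fit2, test = "F").
tab_docall <- do.call(anova, c(fit_L, test = "F"))
tab_direct <- anova(fit0, fit1, fit2, test = "F")

# Both tables hold the same residual sums of squares and p-values.
same_rss <- isTRUE(all.equal(tab_docall$RSS, tab_direct$RSS))
same_p   <- isTRUE(all.equal(tab_docall[["Pr(>F)"]], tab_direct[["Pr(>F)"]]))
print(tab_docall)
```

For the diversitree fits the analogous call would presumably be do.call(anova, fit_L) without the test argument; that is not run here because it needs the diversitree package.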

If you look at your examples with anova.lm, you see that the output is different if you use lapply. From the documentation:

Specifying a single object gives a sequential analysis of variance table for that fit. [...] If more than one object is specified, the table has a row for the residual degrees of freedom and sum of squares for each model. For all but the first model, the change in degrees of freedom and sum of squares is also given. ...

lapply passes single objects to anova.lm. This doesn't work for your mle fits because the corresponding anova method only does model comparison.
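The difference between the two approaches shows up in the shape of the result. The following sketch, again on the built-in LifeCycleSavings data and with illustrative variable names (per_model, comparison), contrasts them: lapply returns one sequential table per model, whereas do.call returns a single comparison table with one row per model.

```r
fit0 <- lm(sr ~ 1, data = LifeCycleSavings)
fit1 <- update(fit0, . ~ . + pop15)
fit2 <- update(fit1, . ~ . + pop75)
fit_L <- list(fit0, fit1, fit2)

# lapply: anova is called three times, each time with ONE model,
# so the result is a list of three separate sequential tables.
per_model <- lapply(fit_L, anova)

# do.call: anova is called once with ALL models as arguments,
# so the result is one table comparing the models to each other.
comparison <- do.call(anova, fit_L)

length(per_model)   # number of separate tables
nrow(comparison)    # number of rows (one per model) in the single table
```

An anova method that only supports comparison has nothing to do with a single model, which is why the lapply version fails for the mle fits while the lm version silently produces a different kind of table.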