Mud Warrior Mud Warrior - 2 months ago 6
R Question

How can I compute the median absolute deviation (MAD) for generalized linear mixed-effects models

I know my question is linked to stats but I'm looking for a solution in

, so I believe it's suited for SO.

I built a generalized linear mixed-effects model (GLMM) using the
function from the
package in
to model species richness around aquaculture sites based on significant explanatory variables using Zuur et al. (2009) Mixed Effects Models and Extensions in Ecology with R. The model is:

Mod1 <- glmer(Richness ~ Distance + Depth + Substrate + Beggiatoa +
Distance*Beggiatoa + (1|Site/transect), family = poisson, data = mydata)

Now I have a full data set collected at different sites and I want to assess how this model performs on the new data set.

Following a question on CV, someone suggested to look for the median absolute deviation (mad) on the new data set. I tried the
function from the
package in
but I get the following error message:

Error in x[!] : object of type 'S4' is not subsettable
In addition: Warning messages:
1: In : applied to non-(list or vector) of type 'S4'
2: In : applied to non-(list or vector) of type 'S4'

Does anybody knows what's going wrong here? Is it that
can't be calculated for GLMMs? If so, is there another
package to calculate mad from GLMMs?


To give you an idea of my data, here's the output from
, also note that there's no "Substrate" category in the new data set and "S" refers to "Richness":

structure(list(S = c(0, 1, 2, 3, 3, 2), Site = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = c("BC", "BH", "GC", "IS", "Ref"
), class = "factor"), Transect = structure(c(4L, 4L, 4L, 4L,
4L, 4L), .Label = c("10GC", "10IS", "10N", "10S", "11IS", "12IS",
"13E", "1GC", "1N", "1W", "2E", "2GC", "2IS", "2N", "2W", "2WA",
"3E", "3GC", "3IS", "3N", "3S", "4E", "4GC", "4IS", "4S", "4W",
"5GC", "5IS", "5S", "6GC", "6IS", "6N", "6S", "6W", "7E", "7GC",
"7IS", "8GC", "8IS", "8W", "9E", "9GC", "9IS", "9N", "RefBC1",
"RefBC10", "RefBC11", "RefBC12", "RefBC2", "RefBC3", "RefBC4",
"RefBC5", "RefBC6", "RefBC7", "RefBC8", "RefBC9", "X1", "X2"), class = "factor"),
Distance = c(2, 20, 40, 80, 120, 160), Depth = c(40L, 40L,
50L, 40L, 40L, 40L), Beggiatoa = c(2, 1, 1, 0, 0, 0)), .Names = c("S",
"Site", "Transect", "Distance", "Depth", "Beggiatoa"), row.names = c(NA,
6L), class = "data.frame")


For within-sample error, the mean absolute deviation computation would just be


... you might want residuals(fitted_model,type="response"), since residuals will give you deviance residuals by default (see ?residuals.merMod)

If you want to look at out-of-sample error, you could do something like this:

pred <- predict(fitted_model,newdata=newdf,type="response",

(re.form=~0 specifies that you want to omit random effects from the prediction, which is your only choice unless you're predicting at sites/transects where you've also got training data)