Salmo salar - 1 year ago 375
R Question

# lmer returning error: grouping factor must be < number of observations

I am attempting to run a mixed effect model (using R) on some data but struggling with one of the fixed effects, I think primarily due to it a factor?!

sample data:

``````data4<-structure(list(code = structure(1:10, .Label = c("10888", "10889",
"10890", "10891", "10892", "10893", "10894", "10896", "10897",
"10898", "10899", "10900", "10901", "10902", "10903", "10904",
"10905", "10906", "10907", "10908", "10909", "10910", "10914",
"10916", "10917", "10919", "10920", "10922", "10923", "10924",
"10925", "10927"), class = "factor"), speed = c(0.0296315046039244,
0.0366986630049636, 0.0294297725505692, 0.048316183511095, 0.0294275666501456,
0.199924957584131, 0.0798850288176711, 0.0445886457047146, 0.0285993712316451,
0.0715158276875623), meanflow = c(0.657410742496051, 0.608271363339857,
0.663241108786611, 0.538259450171821, 0.666299529534762, 0.507156583629893,
0.762448863636364, 37.6559178370787, 50.8557196935557, 31.6601587837838
), length = c(136, 157, 132, 140, 135, 134, 144, 149, 139, 165
), river = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L
), .Label = c("c", "f"), class = "factor")), .Names = c("code",
"speed", "meanflow", "length", "river"), row.names = c(2L, 4L,
6L, 8L, 10L, 12L, 14L, 16L, 18L, 20L), class = "data.frame")
``````

my model is as such:

``````model1<-lmer(speed ~ river + length +(1|meanflow)+(1|code), data4)
``````

and when run returns error message:

``````Error in checkNlevels(reTrms\$flist, n = n, control) :
number of levels of each grouping factor must be < number of observations
``````

having trawled the internet I have found one response:

http://stats.stackexchange.com/questions/57912/random-effects-must-be-less-than-the-number-of-observations-error-in-lmer-pac

but for the life of me do not understand the responses to the question!

a point in the right direction will be more than welcome

You have two problems here:

• It looks like you have one observation for every value of `code`. That means that you can't estimate both a residual variance (which is built in to `lmer`, and linear mixed models more generally) and an among-`code` variance -- both of these parameters will be trying to estimate the same variance component, and any combination of `var(residual)` and `var(code)` that adds up to the same value will represent an equally good fit to the data.

• You also have one observation for every value of `meanflow`; this is because `meanflow` is a continuous variable, which is not usually something you want to use as a grouping variable in the model. I'm not sure what you're trying to capture with this term.

You can actually fit these models if you insist by using `lmerControl` to bypass the checks, but you won't necessarily get a sensible result!

``````model2 <- lmer(speed ~ river + length +(1|meanflow)+(1|code), data4,
control=lmerControl(check.nobs.vs.nlev = "ignore",
check.nobs.vs.rankZ = "ignore",
check.nobs.vs.nRE="ignore"))
``````

Here the variance has been divided approximately in equal thirds:

`````` VarCorr(model2)
##  Groups   Name        Std.Dev.
##  meanflow (Intercept) 0.035354
##  code     (Intercept) 0.032898
##  Residual             0.033590
``````

If we use only one (still inappropriate) random effect,

``````model0 <- lmer(speed ~ river + length +(1|meanflow), data4,
control=lmerControl(check.nobs.vs.nlev = "ignore",
check.nobs.vs.rankZ = "ignore",
check.nobs.vs.nRE="ignore"))
``````

Now the variance is divided exactly in halves:

``````VarCorr(model0)
##  Groups   Name        Std.Dev.
##  meanflow (Intercept) 0.041596
##  Residual             0.041596
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download