tanya singh - 9 months ago
R Question

R code if my random effect is nested under another random effect

I am trying to fit a linear mixed model for my study in R, and I would like to know whether my code is correct.
My design: I have 5 sites, 2 subsites within each site, and 2 permanent quadrats within each subsite.
So I have 5 sites, 10 subsites and 20 quadrats. I have measured colony size (of corals) in all the quadrats.
My question is: does the size structure vary between sites?
In my data, quadrats are nested within subsites and subsites are nested within sites.
I will use site as my fixed factor and subsite and quadrat as my random effects.
I can think of two possible ways of doing this:


option 1
lmer(size ~ site + (1|subsite) + (1|quadrat))

option 2
lmer(size ~ site + (1|site:subsite) + (1|subsite:quadrat))

Which one of these would be correct to use?


Answer

It depends a bit on how your subsites and quadrats are coded. Let's consider two schemes.

explicit nesting: this means that subsites within sites and quadrats within subsites don't have unique names, i.e. the same labels are reused from one site to the next, e.g.

site subsite quadrat
A    a       1
A    a       2
A    b       1
A    b       2
B    a       1
B    a       2
... etc.

In this case, you must use interaction/nesting syntax to let R know that quadrat 1 in site A, subsite a has nothing in common with all of the other quadrats labeled "1" ...

size ~ site + (1|site:subsite) + (1|site:subsite:quadrat)

(size ~ site + (1|site:(subsite/quadrat)) might work, but I haven't tested it)
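To see why the interaction syntax matters under this coding, it can help to count the grouping levels R would actually see. The sketch below uses only base R and a hypothetical design table (the name `design` and the label values are assumptions for illustration). It compares the bare factors, the `subsite:quadrat` term from option 2 in the question, and the full interactions.

```r
# Hypothetical design table with reused labels (explicit nesting):
# 5 sites x 2 subsites x 2 quadrats
design <- expand.grid(
  quadrat = factor(1:2),
  subsite = factor(c("a", "b")),
  site    = factor(LETTERS[1:5])
)

# Bare factors: R sees only 2 subsite labels and 2 quadrat labels,
# so (1|subsite) + (1|quadrat) would pool units across sites.
n_subsite_labels <- nlevels(design$subsite)
n_quadrat_labels <- nlevels(design$quadrat)

# subsite:quadrat (option 2 in the question) still ignores site.
n_option2 <- nlevels(with(design, interaction(subsite, quadrat, drop = TRUE)))

# Full interactions recover the real sampling units.
n_subsites <- nlevels(with(design, interaction(site, subsite, drop = TRUE)))
n_quadrats <- nlevels(with(design, interaction(site, subsite, quadrat, drop = TRUE)))

print(c(n_subsite_labels, n_quadrat_labels, n_option2, n_subsites, n_quadrats))
# 2 2 4 10 20
```

Only the terms that include `site` reach the 10 subsites and 20 quadrats of the actual design.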

implicit nesting: in this case, everything is uniquely named.

site subsite quadrat
A    Aa      Aa1
A    Aa      Aa2
A    Ab      Ab1
A    Ab      Ab2
B    Ba      Ba1
B    Ba      Ba2
... etc.

In this case, you can use either the syntax above (R automatically drops the redundant levels) or

size ~ site + (1|subsite) + (1|quadrat)

and you should get identical results. (You can always test this experimentally!)
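One way to run that experiment is sketched below. It assumes `lme4` is installed, and everything about the data is made up: the 8 colonies per quadrat, the effect sizes, and the helper column names `s` and `q` are hypothetical, chosen only so the model has something to fit.

```r
library(lme4)

set.seed(1)

# Simulated data (hypothetical): 5 sites, 2 subsites per site,
# 2 quadrats per subsite, 8 colonies measured per quadrat.
d <- expand.grid(colony = 1:8, q = 1:2, s = c("a", "b"), site = LETTERS[1:5])
d$site    <- factor(d$site)
d$subsite <- factor(paste0(d$site, d$s))      # unique labels: Aa, Ab, Ba, ...
d$quadrat <- factor(paste0(d$subsite, d$q))   # unique labels: Aa1, Aa2, ...

# Made-up effects at each level of the design
site_eff    <- rnorm(5,  sd = 2)[as.integer(d$site)]
subsite_eff <- rnorm(10, sd = 1)[as.integer(d$subsite)]
quadrat_eff <- rnorm(20, sd = 1)[as.integer(d$quadrat)]
d$size <- 10 + site_eff + subsite_eff + quadrat_eff + rnorm(nrow(d), sd = 1)

# Same model written with the simple and with the interaction syntax
m_simple <- lmer(size ~ site + (1 | subsite) + (1 | quadrat), data = d)
m_nested <- lmer(size ~ site + (1 | site:subsite) + (1 | site:subsite:quadrat),
                 data = d)

# Compare the fits up to numerical tolerance
same_loglik <- isTRUE(all.equal(as.numeric(logLik(m_simple)),
                                as.numeric(logLik(m_nested)),
                                tolerance = 1e-5))
same_fixef  <- isTRUE(all.equal(unname(fixef(m_simple)),
                                unname(fixef(m_nested)),
                                tolerance = 1e-5))
print(c(same_loglik, same_fixef))
```

Comparing `VarCorr()` output from the two fits is another quick check; only the names of the grouping terms should differ.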

A couple of other points:

  • in general I recommend unique labels/implicit nesting, because it slightly reduces the chances of error (explicit nesting may be more convenient for humans recording data in field notes, but you should convert to implicit nesting early in your data cleaning process)
  • I always recommend using the data argument with lme4, so that variables are looked up in your data frame rather than in the workspace
  • if you don't care about quantifying within-site variation, and if your design is balanced, and your data are Normal (i.e. you're using lmer and not glmer), you can greatly simplify your life by aggregating to the mean value of each subsite (the top level of replication within a site; a single mean per site would leave nothing to estimate the error from) and running a 1-way ANOVA on site (see Murtaugh 2007, Ecology, "Simplicity and complexity in ecological data analysis").
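The relabeling and aggregation steps can be sketched in base R as follows. The data are simulated and the 6 colonies per quadrat are an assumption. The sketch averages to subsite means rather than to a single mean per site, so that each site keeps two replicate values for the error term.

```r
set.seed(42)

# Hypothetical raw data with reused field labels (a/b, 1/2),
# 6 colonies measured per quadrat
d <- expand.grid(colony = 1:6, quadrat = 1:2, subsite = c("a", "b"),
                 site = LETTERS[1:5], stringsAsFactors = FALSE)
d$size <- rnorm(nrow(d), mean = 10, sd = 2)   # made-up colony sizes

# Convert to unique labels early in data cleaning
d$subsite <- paste0(d$site, d$subsite)        # "Aa", "Ab", "Ba", ...
d$quadrat <- paste0(d$subsite, d$quadrat)     # "Aa1", "Aa2", ...

# Average to one value per subsite, keeping site as the grouping factor
sub_means <- aggregate(size ~ site + subsite, data = d, FUN = mean)
sub_means$site <- factor(sub_means$site)

# One-way ANOVA of subsite means on site, passing the data explicitly
fit <- aov(size ~ site, data = sub_means)
print(summary(fit))
```

With 10 subsite means across 5 sites, the ANOVA has 4 degrees of freedom for site and 5 for the residual.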