Mads Stehr - 6 months ago 53

R Question

I am currently trying to model the interaction between a covariate

`x`

`gamm`

`mgcv`

`M0`

- When trying to nest the models properly I would like to get out the smoothing parameter for the 0-gender smooth from , and use it in the simpler model specification. But I get the following error message:
`M0`

Error in gamm.setup(gp, pterms = pTerms, data = mf, knots = knots, parametric.only = FALSE, :

gamm can not handle linked smoothing parameters (probably from use of `id' or adaptive smooths)

- Second question is the more stupid one. Are the models even nested when I go from a smooth for each gender to a 0-gender smooth and a linear difference to 1-gender?

Below follows an example. I simulated some random data, so the data does not display the behaviour of my actual data, but the problems remain the same.

`library(mgcv)`

### Simulate random data

x <- rnorm(100, mean = 10, sd = 1.5)

y <- rnorm(100, mean = 1, sd = 0.025)

id <- sample(1:10, size = 100, replace = T)

id <- as.factor(id)

gender <- sample(c(0,1), size = 100, replace = T)

### Specify main model, M0

ctrl <- list(niterEM=0,optimMethod="L-BFGS-B", msMaxIter = 100)

M0 <- gamm(y~s(x, by = as.factor(gender)) + gender,

random=list(id=~1+x), control=ctrl)

### Specify model with linear difference between gender0 and gender1

M1 <- gamm(y~s(x) + gender:x + gender,

random=list(id=~1+x), control=ctrl)

### Testing

anova(M0$lme, M1$lme)

### Problems:

sp0 <- M0$gam$sp[1]

M1 <- gamm(y~s(x, sp = sp0) + gender:x + gender,

random=list(id=~1+x), control=ctrl)

Error in gamm.setup(gp, pterms = pTerms, data = mf, knots = knots, parametric.only = FALSE, :

`gamm can not handle linked smoothing parameters (probably from use of `id' or adaptive smooths)`

Any thoughts? Thanks in advance.

Answer

`gamm`

errorThis is a very interesting thing... Well, I should explain the logic first.

In principle **it is illegal to fix a smoothing parameter in gamm**, because

`gamm`

will treat the wiggly components of the smooth as random effects, the variance of which to be estimated by `lme`

(as you have Gaussian data). If you try to fix the smoothing parameter, that is essentially saying that you want to fix the variance of the random effect. Well, `lme`

does not allow you to do this (and I doubt whether such attempt is legal in mixed modelling, either.)Therefore `gamm`

would disable any possible constraints on smoothing parameters, including:

- lower bound of smoothing parameter, via
`min.sp`

; - linked smooth sharing the same smoothing parameter, via
`id`

in`s()`

; - directly specified smoothing parameter via
`sp`

, via`s()`

.

The first two are perfectly checked. `gamm`

has no `min.sp`

argument like `gam`

; even if you specify it via `...`

, there is no chance it would get used (as later it's `NULL`

that's passed to `gam.setup`

during `gamm.setup`

, so your specified `min.sp`

is ignored). Specification of `id`

will also be detected by the error message you saw, but of course you did not specify `id`

so the above error is not reporting the correct issue here, hence a bug.

The third one, actually has not been directly checked via `gamm`

. Ideally, as soon as the `gamm`

/ `gam`

formula has been interpreted (by `interpret.gam`

), `sp`

should be reset to `-1`

if it is not readily is, and a warning message about this should be issued. However, this part is missing. Therefore, at the moment the best thing you can do is just not specifying `sp`

.

Now let's turn to your concern on nesting. **Nesting is related to basis set up rather than choice of smoothing parameter.** As long as you have the same set of basis (same basis type, same number and / or location of "knots"), the model matrix will be the same. For example, in your model `M0`

and `M1`

, you have the same configuration of the `s()`

with `mgcv`

default `bs = 'tp', k = 10`

. So the design matrix for `s()`

is the same in your two models. The `by = factor(gender)`

just replicates this `s()`

to all levels of `gender`

in your main model `M0`

. Perhaps it is not easily seen, but actually your `M1`

is nested in `M0`

.

Let's consider a small example to verify this. For simplicity, I will not use `s(x)`

from `mgcv`

, but use `poly(x, degree = 2)`

(imagining it is `s(x)`

). Let's generate some toy data first:

```
x <- 1:10
f <- gl(2, 5, labels = c("M", "F"))
```

Since `f`

is not an ordered factor, `s(x, by = factor(f))`

generates design matrix by replicating `s(x)`

for all levels of `f`

:

```
## original design matrix for `s(x)`
X0 <- poly(x, 2)
## design matrix for `f`, without contrasting
Xf <- model.matrix(~ f + 0)
## design matrix for level `M`
X1 <- X0 * Xf[, 1]
## design matrix for level `F`
X2 <- X0 * Xf[, 2]
## design matrix for `s(x, by = f)` "please, imagine it as `poly`"
X <- cbind(X1, X2)
# 1 2 1 2
# [1,] -0.49543369 0.52223297 0.00000000 0.00000000
# [2,] -0.38533732 0.17407766 0.00000000 0.00000000
# [3,] -0.27524094 -0.08703883 0.00000000 0.00000000
# [4,] -0.16514456 -0.26111648 0.00000000 0.00000000
# [5,] -0.05504819 -0.34815531 0.00000000 0.00000000
# [6,] 0.00000000 0.00000000 0.05504819 -0.34815531
# [7,] 0.00000000 0.00000000 0.16514456 -0.26111648
# [8,] 0.00000000 0.00000000 0.27524094 -0.08703883
# [9,] 0.00000000 0.00000000 0.38533732 0.17407766
#[10,] 0.00000000 0.00000000 0.49543369 0.52223297
```

Your second model `M1`

, has only a smooth term `s(x)`

, the design matrix of which is `X0`

.

Here is how we can see that your `M1`

is nested in `M0`

:

- As
`X1 + X2 = X0`

, design matrix of`s(x)`

and`s(x, by = f)`

have the same column span, thus`s(x)`

is nested in`s(x, by = f)`

; - since
`x`

is nested in`s(x)`

,`x:f`

is nested in`s(x, by = f)`

.

Although you models are already nicely nested, the main model `M0`

does not have a comparable interpretation with your `M1`

. Your main model `M0`

will end up with an independent smooth for each level, while your `M1`

focus on the difference between two groups.

It would be good if we could control `M0`

to admit a form of "reference smooth + difference smooth". Then, if the difference smooth turns out a line, without actually fitting `M1`

we already know there is no evidence for non-linear difference between groups.

In `mgcv`

, difference smooth will be constructed, if your factor is ordered. So I suggest you fit your main model by:

```
gender1 <- ordered(gender) ## create an ordered factor
s(x) + s(x, by = gender1) + gender
```

If estimation result shows the difference smooth `s(x, by = gender1)`

as a line, you know you can instead fit a simpler model

```
s(x) + gender:x + gender
```

even without using `F-test`

.

Note, it is very important to have an ordered factor `by`

in order to construct "difference" smooth. If you do

```
s(x) + s(x, by = gender) + gender ## note, it is "gender" in "by"
```

`s(x)`

and `s(x, by = gender)`

are completely linearly dependent. The resulting model matrix will be **rank-deficient**.