ZzKr - 2 years ago 306
R Question

# plot.lm Error: \$ operator is invalid for atomic vectors

I have the following regression model with transformations:

``````
expo <- 3
fit <- lm( I(NewValue^(1/expo)) ~ I(CurrentValue^(1/expo)) + Age + Type -1,
data=dataReg)
plot(fit)
``````

But plot gives me the following error:

``````
Error: $ operator is invalid for atomic vectors
``````

It might have to do with plot calling `$` on my data.frame `dataReg`, but I cannot figure out how to prevent it. Any ideas about what I am doing wrong?

Note: the regression model itself works correctly, and I can call `summary`, `predict`, and `resid` on it without errors.

This is actually quite an interesting observation. In fact, among all 6 plots supported by `plot.lm`, only the Q-Q plot fails in this case. Consider the following reproducible example:

``````
x <- runif(20)
y <- runif(20)
fit <- lm(I(y ^ (1/3)) ~ I(x ^ (1/3)))
## only `which = 2L` (QQ plot) fails; `which = 1, 3, 4, 5, 6` all work
stats:::plot.lm(fit, which = 2L)
``````

Inside `plot.lm`, the Q-Q plot is simply produced as follows:

``````
rs <- rstandard(fit)  ## standardised residuals
qqnorm(rs)  ## fine
## inside `qqline(rs)`
yy <- quantile(rs, c(0.25, 0.75))
xx <- qnorm(c(0.25, 0.75))
slope <- diff(yy)/diff(xx)
int <- yy[1L] - slope * xx[1L]
abline(int, slope)  ## this fails!!!
``````

Error: \$ operator is invalid for atomic vectors

So this is purely a problem with the `abline` function! Note:

``````
is.object(int)
# [1] TRUE

is.object(slope)
# [1] TRUE
``````

i.e., both `int` and `slope` have a class attribute (read `?is.object`; it is a very efficient way to check whether an object has a class attribute). What class?

``````
class(int)
# [1] "AsIs"

class(slope)
# [1] "AsIs"
``````

This is the result of using `I()`. More precisely, they inherit this class from `rs`, which in turn inherits it from the response variable. That is, if we use `I()` on the response, i.e. the LHS of the model formula, we get this behaviour.
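To see how the class travels through ordinary arithmetic, here is a minimal sketch that does not involve `lm` at all. The vector `y` and the names `z` and `z_plain` are made up for illustration.

```r
## Wrapping a vector in I() tags it with class "AsIs"
y <- I(c(1, 8, 27))
class(y)
# [1] "AsIs"

## Default arithmetic copies attributes, so the class survives
z <- y^(1/3) * 2 - 1
class(z)
# [1] "AsIs"
is.object(z)
# [1] TRUE

## as.numeric() drops the class attribute again
z_plain <- as.numeric(z)
is.object(z_plain)
# [1] FALSE
```

Any quantity computed from an `AsIs` vector by plain arithmetic therefore still counts as an "object" in the sense of `is.object`.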

You can do a few experiments here:

``````
abline(as.numeric(int), as.numeric(slope))  ## OK
abline(as.numeric(int), slope)  ## OK
abline(int, as.numeric(slope))  ## fails!!
abline(int, slope)  ## fails!!
``````

So `abline(a, b)` is very sensitive to whether the first argument `a` has a class attribute or not.
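If you need to draw such a line yourself, one possible workaround is to strip the class before calling `abline`. The wrapper below, `abline_plain`, is a hypothetical helper written for this answer and is not part of base R. The values of `int` and `slope` are also made up.

```r
## Hypothetical helper (not part of base R): coerce both arguments to
## plain numerics so that abline() does not treat `a` as a model object
abline_plain <- function(a, b, ...) {
  abline(a = as.numeric(a), b = as.numeric(b), ...)
}

plot(0:1, 0:1)
int <- I(0.1)    # illustrative values carrying the "AsIs" class
slope <- I(0.8)
abline_plain(int, slope)  ## OK
```

This only helps for your own calls to `abline`; it does not change what `plot.lm` does internally.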

Why? Because `abline` can accept a linear model object with "lm" class. Inside `abline`:

``````
if (is.object(a) || is.list(a)) {
    p <- length(coefa <- as.vector(coef(a)))
    ## ... (rest of this branch omitted)
``````

If `a` has a class, `abline` assumes it is a model object (regardless of whether it really is!!!) and then tries to use `coef` to obtain the coefficients. This check is not very robust; we can make `abline` fail rather easily:

``````
plot(0:1, 0:1)
a <- 0  ## plain numeric
abline(a, 1)  ## OK
class(a) <- "whatever"  ## add a class
abline(a, 1)  ## oops, fails!!!
``````

Error: \$ operator is invalid for atomic vectors

So here is the conclusion: avoid using `I()` on your response variable in the model formula. It is OK to have `I()` on covariates, but not on the response. `lm` and most generic functions won't have trouble dealing with this, but `plot.lm` will.
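One way to follow this advice is to compute the transformed response as a separate column before fitting. The sketch below uses simulated data; the data frame `dat`, the column name `y_cuberoot` and the seed are assumptions made for illustration only.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
dat <- data.frame(x = runif(20), y = runif(20))

## Transform the response beforehand instead of wrapping it in I()
dat$y_cuberoot <- dat$y^(1/3)  # hypothetical column name
fit <- lm(y_cuberoot ~ I(x^(1/3)), data = dat)

## The standardised residuals no longer carry a class attribute
is.object(rstandard(fit))
# [1] FALSE

plot(fit, which = 2)  ## Q-Q plot
```

For the model in the question, the same idea would mean storing `NewValue^(1/expo)` in a new column of `dataReg` and using that column as the response, while keeping `I()` on `CurrentValue`.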
