aldo_tapia - 25 days ago
R Question

Linear fit without slope in R

I want to fit a linear model with no slope and extract information from it. My objective is to find the best y-intercept for a horizontal line through a data set, and also to evaluate the R2 of the resulting linear fit to identify whether y has a particular behavior (x is a date). I've been using range to evaluate the behavior, but I'm looking for a unitless index.

Removing the y-intercept:

X <- 1:10

Y <- 2:11

lm1 <- lm(Y~X + 0, data = data.frame(X=X,Y=Y)) # y-intercept remove opt 1

lm1 <- lm(Y~X - 1, data = data.frame(X=X,Y=Y)) # y-intercept remove opt 2

lm1 <- lm(Y~0 + X, data = data.frame(X=X,Y=Y)) # y-intercept remove opt 3

lm1$coefficients
X
1.142857

summary(lm1)$r.squared
[1] 0.9957567


All of the lm fits shown above report an R2. But if I evaluate:

lm2 <- lm(Y~1, data = data.frame(X=X,Y=Y))

lm2$coefficients
(Intercept)
6.5

summary(lm2)$r.squared
[1] 0


Is there a way to calculate R2 outside of the lm function, or to calculate an index that shows how well y is represented by a horizontal line?

Answer

Let lmObject be your linear model returned by lm (called with y = TRUE to return y).

  • If your model has an intercept, then R-squared is computed as

    with(lmObject, 1 - c(crossprod(residuals) / crossprod(y - mean(y))) )
    
  • If your model does not have an intercept, then R-squared is computed as

    with(lmObject, 1 - c(crossprod(residuals) / crossprod(y)) )
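
As a quick check, the two formulas above can be compared with what summary() reports. This is only a sketch: it uses the built-in cars data set as stand-in data (an assumption, chosen because the question's X and Y lie exactly on a line), and the object names are made up for illustration.

```r
# `cars` ships with base R; it is stand-in data, not the question's data.
# y = TRUE stores the response in the fitted object as `y`.
fit_int   <- lm(dist ~ speed,     data = cars, y = TRUE)  # with intercept
fit_noint <- lm(dist ~ speed + 0, data = cars, y = TRUE)  # without intercept

# Centered total sum of squares when an intercept is present
r2_int <- with(fit_int, 1 - c(crossprod(residuals) / crossprod(y - mean(y))))

# Raw (uncentered) total sum of squares when there is no intercept
r2_noint <- with(fit_noint, 1 - c(crossprod(residuals) / crossprod(y)))

# Both should agree with the values reported by summary()
all.equal(r2_int,   summary(fit_int)$r.squared)
all.equal(r2_noint, summary(fit_noint)$r.squared)
```

The two R-squared values are measured against different baselines (the mean of y versus zero), so they are not directly comparable with each other.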
    

Note that if your model has only an intercept (so it falls under the first case above), you have

residuals = y - mean(y)

thus R-squared is always 1 - 1 = 0.
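
The identity above can be confirmed on the question's own data. A minimal sketch, with fit0 as an illustrative name:

```r
Y <- 2:11

# Intercept-only fit; y = TRUE keeps the response in the object
fit0 <- lm(Y ~ 1, y = TRUE)

# The residuals are exactly the deviations from the mean
all.equal(unname(residuals(fit0)), Y - mean(Y))

# So the intercept-case formula gives 1 - 1 = 0 (up to rounding error)
r2_0 <- with(fit0, 1 - c(crossprod(residuals) / crossprod(y - mean(y))))
r2_0
```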

In regression analysis, it is generally recommended to include an intercept in the model to get unbiased estimates. A model with only an intercept is the null model. Any other model is compared against this null model in the analysis of variance.
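
To illustrate the comparison against the null model, here is a rough sketch using anova(). It again assumes the built-in cars data as stand-in data, since the question's X and Y give a perfect fit; the object names are illustrative.

```r
# `cars` ships with base R; it is stand-in data, not the question's data
null_fit <- lm(dist ~ 1,     data = cars)  # intercept only: the null model
full_fit <- lm(dist ~ speed, data = cars)  # adds a slope

# F-test of the full model against the null model
cmp <- anova(null_fit, full_fit)
print(cmp)

# R-squared of the full model is the share of the null model's
# residual sum of squares that the slope removes
r2_from_anova <- cmp$"Sum of Sq"[2] / cmp$RSS[1]
all.equal(r2_from_anova, summary(full_fit)$r.squared)
```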


A note: the quantity you want has nothing to do with regression. You can simply compute it as

c(crossprod(Y - mean(Y)) / crossprod(Y))  ## `Y` is your data
#[1] 0.1633663

Alternatively, use

(length(Y) - 1) * var(Y) / c(crossprod(Y))
#[1] 0.1633663
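
Since the question asks for an index without units, the ratio can be wrapped in a small helper and checked for scale invariance. This is a sketch only: flat_index is a hypothetical name, not something from base R or from the answer itself.

```r
# Hypothetical helper (the name is an assumption for illustration):
# centered sum of squares divided by raw sum of squares
flat_index <- function(y) {
  c(crossprod(y - mean(y)) / crossprod(y))
}

Y <- 2:11
idx <- flat_index(Y)
idx
#[1] 0.1633663

# Rescaling Y (a change of units) leaves the index unchanged
all.equal(idx, flat_index(1000 * Y))

# Shifting Y does change it, because the denominator is not centered
flat_index(Y + 100)
```

A value near 0 means the data sit close to a horizontal line relative to their overall magnitude. Note that 1 minus this index is what the no-intercept R-squared formula would give for the horizontal-line fit.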