Matt Bannert - 16 days ago 4x
R Question

# How to force R to use a specified factor level as reference in a regression?

Somehow I can't find it in my notes, nor can I find the obvious answer on the net. How can I tell R to use a certain level as the reference when I use dummy explanatory variables in a regression?
By default it just picks a level on its own.

``````
lm(x ~ y + as.factor(b))
``````

where `b` takes values in {0, 1, 2, 3, 4}. Let's say I want to use 3 as the reference instead of the 0 that R uses by default.

Thanks in advance!

Answer

See the `relevel()` function. Here is an example:

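``````
set.seed(123)                      # make the simulated data reproducible
x <- rnorm(100)
DF <- data.frame(x = x,
                 y = 4 + (1.5*x) + rnorm(100, sd = 2),
                 b = gl(5, 20))    # factor with levels "1".."5", 20 observations each
head(DF)
str(DF)

# Default fit: the first level of b ("1") serves as the reference
m1 <- lm(y ~ x + b, data = DF)
summary(m1)
``````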
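
Before changing anything, it can help to check which level R currently treats as the reference. The following is a minimal sketch that rebuilds the same factor `b` as above and assumes R's default treatment contrasts (`contr.treatment`) are in effect:

```r
b <- gl(5, 20)    # same factor as DF$b above: levels "1".."5"

levels(b)         # the first level listed is the reference level
levels(b)[1]

# Under the default treatment contrasts, the reference level is the
# row of the contrast matrix that contains only zeros.
contrasts(b)
```

If the first entry of `levels(b)` is not the level you want as the baseline, that is the case `relevel()` is meant for.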

Now alter the factor `b` in `DF` by use of the `relevel()` function:

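``````
# A numeric ref is a position in levels(b); the levels here are "1".."5",
# so position 3 is the level labelled "3".
DF <- within(DF, b <- relevel(b, ref = 3))
m2 <- lm(y ~ x + b, data = DF)
summary(m2)
``````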
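
One caveat for data coded 0 to 4, as in the question: a numeric `ref` is read as a position among the levels, not as a label, so passing the label as a character string is the safer choice. The sketch below uses hypothetical simulated data (`b_raw`, `dat`, and the model `m` are made up for illustration) and also shows `relevel()` used directly inside the formula:

```r
b_raw <- rep(0:4, each = 20)   # hypothetical variable coded 0..4, as in the question
f <- factor(b_raw)             # levels "0","1","2","3","4"

# Numeric ref = position in levels(f): position 3 is the level "2"
ref_by_position <- levels(relevel(f, ref = 3))[1]

# Character ref = label: this selects the level "3"
ref_by_label <- levels(relevel(f, ref = "3"))[1]

ref_by_position
ref_by_label

# relevel() can also be applied inside the model formula
set.seed(1)                    # hypothetical simulated data
dat <- data.frame(x = rnorm(100), y = rnorm(100), b = b_raw)
m <- lm(y ~ x + relevel(factor(b), ref = "3"), data = dat)
names(coef(m))                 # no dummy appears for level "3"
```

Releveling inside the formula leaves the data frame untouched, at the cost of longer coefficient names.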

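The two models use different reference levels: `m1` uses level 1 (there is no `b1` coefficient), while `m2` uses level 3 (there is no `b3` coefficient). The remaining `b` coefficients are now differences from level 3, which is why `b1` in `m2` is the negative of `b3` in `m1`.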

``````
> coef(m1)
(Intercept)           x          b2          b3          b4          b5
  3.2903239   1.4358520   0.6296896   0.3698343   1.0357633   0.4666219
> coef(m2)
(Intercept)           x          b1          b2          b4          b5
 3.66015826  1.43585196 -0.36983433  0.25985529  0.66592898  0.09678759
``````
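
Changing the reference level only reparameterizes the model; the fit itself stays the same. A minimal, self-contained sketch of how one might check this, rebuilding the data from the example (the name `DF2` is introduced here only to keep both versions of the data side by side):

```r
set.seed(123)
x <- rnorm(100)
DF <- data.frame(x = x,
                 y = 4 + (1.5 * x) + rnorm(100, sd = 2),
                 b = gl(5, 20))

m1  <- lm(y ~ x + b, data = DF)              # reference level "1"
DF2 <- within(DF, b <- relevel(b, ref = 3))  # reference level "3"
m2  <- lm(y ~ x + b, data = DF2)

# The fitted values agree up to numerical tolerance
same_fit <- all.equal(fitted(m1), fitted(m2))

# The new intercept is the old intercept plus the old b3 effect
same_intercept <- all.equal(unname(coef(m2)["(Intercept)"]),
                            unname(coef(m1)["(Intercept)"] + coef(m1)["b3"]))

# The effect of level 1 relative to level 3 is the old b3 with flipped sign
flipped_sign <- all.equal(unname(coef(m2)["b1"]), unname(-coef(m1)["b3"]))

c(same_fit = isTRUE(same_fit),
  same_intercept = isTRUE(same_intercept),
  flipped_sign = isTRUE(flipped_sign))
```

The slope for `x` is unaffected by the choice of reference level, as the two coefficient vectors above also show.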