Jared - 2 months ago
R Question

All Levels of a Factor in a Model Matrix in R

I have a data.frame consisting of numeric and factor variables, as seen below.

testFrame <- data.frame(First=sample(1:10, 20, replace=TRUE),
    Second=sample(1:20, 20, replace=TRUE), Third=sample(1:10, 20, replace=TRUE),
    Fourth=rep(c("Alice","Bob","Charlie","David"), 5),
    Fifth=rep(c("Edward","Frank","Georgia","Hank","Isaac"), 4),
    stringsAsFactors=TRUE)  # R >= 4.0 no longer converts strings to factors by default


I want to build a matrix that assigns dummy variables to the factors and leaves the numeric variables alone.

model.matrix(~ First + Second + Third + Fourth + Fifth, data=testFrame)


As expected when running lm, this leaves out one level of each factor as the reference level. However, I want to build a matrix with a dummy/indicator variable for every level of every factor. I am building this matrix for glmnet, so I am not worried about multicollinearity.

Is there a way to have model.matrix create a dummy for every level of each factor?

Answer

You need to reset the contrasts for the factor variables:

model.matrix(~ Fourth + Fifth, data=testFrame, 
        contrasts.arg=list(Fourth=contrasts(testFrame$Fourth, contrasts=FALSE), 
                Fifth=contrasts(testFrame$Fifth, contrasts=FALSE)))

or, with a little less typing but without the proper column names (the levels come out numbered, e.g. Fourth1, Fourth2, rather than named):

model.matrix(~ Fourth + Fifth, data=testFrame, 
    contrasts.arg=list(Fourth=diag(nlevels(testFrame$Fourth)), 
            Fifth=diag(nlevels(testFrame$Fifth))))
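
With more than a couple of factors, writing out each contrasts.arg entry by hand gets tedious. The following is only a minimal sketch of one way to generalize the first approach: it finds every factor column and builds the full-indicator contrasts for each. The names is_fac, full_contrasts and mm are illustrative, and the data frame is a cut-down version of testFrame with the factors created explicitly.

```r
set.seed(1)  # only so the sampled numeric column is reproducible

# Cut-down version of testFrame; factor() is called explicitly so the
# example does not depend on the stringsAsFactors default.
testFrame <- data.frame(
  First  = sample(1:10, 20, replace = TRUE),
  Fourth = factor(rep(c("Alice", "Bob", "Charlie", "David"), 5)),
  Fifth  = factor(rep(c("Edward", "Frank", "Georgia", "Hank", "Isaac"), 4))
)

# Logical vector marking which columns are factors
is_fac <- vapply(testFrame, is.factor, logical(1))

# For each factor, request the identity contrast matrix (one column per level)
full_contrasts <- lapply(testFrame[is_fac], contrasts, contrasts = FALSE)

# Numeric columns pass through unchanged; factors get a dummy for every level
mm <- model.matrix(~ ., data = testFrame, contrasts.arg = full_contrasts)
colnames(mm)
```

The result still contains an "(Intercept)" column. Since glmnet fits its own intercept by default, that column can usually be dropped before fitting, for example with mm[, -1].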