Peter S - 4 months ago 56

R Question

I have two questions about prediction using GLMNET - specifically about the intercept.

I made a small example of train data creation, GLMNET estimation and prediction on the train data (which I will later change to Test data):

```
library(glmnet)

# Train data creation
Train <- data.frame('x1'=runif(10), 'x2'=runif(10))
Train$y <- Train$x1-Train$x2+runif(10)

# From Train data frame to x and y matrix
y <- Train$y
x <- as.matrix(Train[,c('x1','x2')])

# Glmnet model
Model_El <- glmnet(x,y)
Cv_El <- cv.glmnet(x,y)

# Prediction
Test_Matrix <- model.matrix(~.-y,data=Train)[,-1]
Test_Matrix_Df <- data.frame(Test_Matrix)
Pred_El <- predict(Model_El,newx=Test_Matrix,s=Cv_El$lambda.min,type='response')
```

I want to have an intercept in the estimated formula. This code gives an error concerning the dimensions of the Test_Matrix matrix unless I remove the (Intercept) column of the matrix - as in

`Test_Matrix <- model.matrix(~.-y,data=Train)[,-1]`

My questions are:

- Is it the right way to do this in order to get the prediction - when I want the prediction formula to include the intercept?
- If it is the right way: Why do I have to remove the intercept in the matrix?

Thanks in advance.

Answer

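If you want to predict from a model with an intercept, you have to fit a model with an intercept, and the columns of `newx` must match the columns of the `x` used for fitting. Your code fit the model on `x <- as.matrix(Train[,c('x1','x2')])`, which has no intercept column, so if you supply an intercept column in `newx` when using `predict`, you get a dimension error. Note that `glmnet` estimates its own intercept by default (`intercept = TRUE`), so your two-column fit already includes one, and dropping the `(Intercept)` column from the prediction matrix is a valid way to predict.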
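To make the column mismatch concrete, here is a minimal base R sketch. The toy data mirrors the question's `Train` data frame; the names `mm_full` and `mm_trim` and the `set.seed` call are illustrative additions, not part of the original post.

```r
# Hypothetical toy data mirroring the question's Train data frame
set.seed(1)
Train <- data.frame(x1 = runif(10), x2 = runif(10))
Train$y <- Train$x1 - Train$x2 + runif(10)

# Matrix used for fitting in the question: two columns, no intercept column
x <- as.matrix(Train[, c("x1", "x2")])

# model.matrix() prepends an "(Intercept)" column of ones by default
mm_full <- model.matrix(~ . - y, data = Train)

# Dropping the first column restores the two-column layout of x
mm_trim <- mm_full[, -1]

colnames(x)        # "x1" "x2"
colnames(mm_full)  # "(Intercept)" "x1" "x2"
colnames(mm_trim)  # "x1" "x2"
```

Only `mm_trim` has the same columns as `x`, which is why passing the untrimmed matrix as `newx` fails for a model fitted on `x`.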

You can do the following:

```
x <- model.matrix(y ~ ., Train) ## model matrix with intercept
Model_El <- glmnet(x,y)
Cv_El <- cv.glmnet(x,y)
Test_Matrix <- model.matrix(y ~ ., Train) ## prediction matrix with intercept
Pred_El <- predict(Model_El, newx = Test_Matrix, s = Cv_El$lambda.min, type='response')
```
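As a cross-check, the sketch below fits on the two-column matrix from the question and inspects the coefficients. It relies on `intercept = TRUE` being the default of `glmnet`; the larger sample size (50 rows), the `set.seed` call, and the name `Coef_El` are assumptions made for illustration.

```r
library(glmnet)

set.seed(42)
# Hypothetical toy data; 50 rows so the cross-validation folds are not tiny
Train <- data.frame(x1 = runif(50), x2 = runif(50))
Train$y <- Train$x1 - Train$x2 + runif(50)

x <- as.matrix(Train[, c("x1", "x2")])  # no intercept column
y <- Train$y

Model_El <- glmnet(x, y)   # intercept = TRUE is the default
Cv_El <- cv.glmnet(x, y)

# The coefficient matrix carries an "(Intercept)" row even though x has none
Coef_El <- coef(Model_El, s = Cv_El$lambda.min)
rownames(Coef_El)

# Prediction needs newx with the same two columns as x
Pred_El <- predict(Model_El, newx = x, s = Cv_El$lambda.min, type = "response")
dim(Pred_El)
```

If the row names include `(Intercept)`, the fitted model has an intercept without any constant column in `x`, so both approaches (with or without the extra column) should lead to a prediction formula that includes one, as long as `x` and `newx` are built the same way.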

Note that you don't have to write

```
model.matrix(~ . -y)
```

`model.matrix` will ignore the LHS of the formula, so it is legitimate to use

```
model.matrix(y ~ .)
```
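The equivalence of the two formulas can be checked directly in base R. This is a small sketch on toy data shaped like the question's `Train`; the names `mm_a` and `mm_b` are illustrative.

```r
set.seed(1)
Train <- data.frame(x1 = runif(10), x2 = runif(10))
Train$y <- Train$x1 - Train$x2 + runif(10)

# One-sided formula: "." expands to every column, so y is removed explicitly
mm_a <- model.matrix(~ . - y, data = Train)

# Two-sided formula: "." expands to every column except the response y
mm_b <- model.matrix(y ~ ., data = Train)

# Compare column names and values of the two design matrices
same_cols <- identical(colnames(mm_a), colnames(mm_b))
same_vals <- isTRUE(all.equal(mm_a, mm_b, check.attributes = FALSE))
c(same_cols, same_vals)
```

Both comparisons should come out `TRUE`: each matrix contains `(Intercept)`, `x1` and `x2`, and neither contains `y`.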