Mayou - 3 months ago 13

R Question

I am trying to use LASSO for variable selection, and attempted the implementation in R using the `glmnet` package. This is the code I have written so far:

`set.seed(1)`

library(glmnet)

return = matrix(ret.ff.zoo[which(index(ret.ff.zoo) == beta.df$date[1]),])

data = matrix(unlist(beta.df[which(beta.df$date == beta.df$date[1]),][,-1]), ncol = num.factors)

dimnames(data)[[2]] <- names(beta.df)[-1]

model <- cv.glmnet(data, return, standardize = TRUE)

coef(model)

This is what I obtain when I run it the first time:

`> coef(model)`

15 x 1 sparse Matrix of class "dgCMatrix"

1

(Intercept) 0.009159452

VAL .

EQ .

EFF .

SIZE 0.018479078

MOM .

FSCR .

MSCR .

SY .

URP .

UMP .

UNIF .

OIL .

DEI .

PROD .

BUT, this is what I obtain when I run the SAME code once more:

`> coef(model)`

15 x 1 sparse Matrix of class "dgCMatrix"

1

(Intercept) 0.008031915

VAL .

EQ .

EFF .

SIZE 0.021250778

MOM .

FSCR .

MSCR .

SY .

URP .

UMP .

UNIF .

OIL .

DEI .

PROD .

I am not sure why the model behaves this way. How would I be able to choose a final model if the coefficients change at every run? Does it use a different tuning parameter $\lambda$ at every run? I thought that `cv.glmnet` uses `model$lambda.1se` by default.

I have just started learning about this package, and would appreciate any help I can get!

Thank you!

Answer

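The model isn't deterministic: `cv.glmnet` assigns observations to cross-validation folds at random, so the selected $\lambda$ (and therefore the coefficients) can differ from one run to the next. Run `set.seed(1)` immediately before your model fit to produce deterministic results.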
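As a minimal sketch of the point above, the following uses synthetic data, since `ret.ff.zoo` and `beta.df` from the question are not available. The names `x`, `y`, `foldid`, and the `fit_*` objects are made up for illustration. It shows two ways to get repeatable results: resetting the seed right before each call, or passing the fold assignment explicitly through the `foldid` argument of `cv.glmnet`.

```r
library(glmnet)

# Synthetic stand-in for the question's `data` and `return`
# (assumption: 100 observations, 14 predictors, only column 4 matters).
set.seed(42)
x <- matrix(rnorm(100 * 14), nrow = 100, ncol = 14)
y <- 0.5 * x[, 4] + rnorm(100)

# Option 1: reset the seed immediately before each fit, so the random
# fold assignment is identical in both calls.
set.seed(1)
fit_a <- cv.glmnet(x, y, standardize = TRUE)
set.seed(1)
fit_b <- cv.glmnet(x, y, standardize = TRUE)
identical(fit_a$lambda.1se, fit_b$lambda.1se)

# Option 2: fix the folds yourself and pass them in via `foldid`,
# which removes the dependence on the random number generator.
foldid <- sample(rep(1:10, length.out = nrow(x)))
fit_c <- cv.glmnet(x, y, standardize = TRUE, foldid = foldid)
fit_d <- cv.glmnet(x, y, standardize = TRUE, foldid = foldid)
identical(fit_c$lambda.1se, fit_d$lambda.1se)

# coef() on a cv.glmnet object uses s = "lambda.1se" unless told otherwise.
coef(fit_a)
coef(fit_a, s = "lambda.min")
```

Either option pins down the folds, so `lambda.min` and `lambda.1se` stay the same across runs. The particular split is still arbitrary, so a different seed may select a somewhat different $\lambda$.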