Pritish Kakodkar - 1 year ago
R Question

Selecting the statistically significant variables in an R glm model

I have an outcome variable, say Y and a list of 100 dimensions that could affect Y (say X1...X100).

After running my glm and viewing a summary of my model, I can see which variables are statistically significant. I would like to select those variables, run another model with only them, and compare performance. Is there a way to parse the model summary and select only the significant ones?

Answer Source

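You can access the p-values of a glm fit through the "summary" function. The last column of the coefficients matrix holds the p-values of the terms used in the model. It is called "Pr(>|t|)" for families with an estimated dispersion, such as the default gaussian, and "Pr(>|z|)" for families such as binomial and poisson.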

Here's an example:

# x is a 10 x 3 matrix of random predictors
x = matrix(rnorm(3*10), ncol=3)
y = rnorm(10)
res = glm(y~x)
# p-values are in column 4 of the coefficient matrix; drop row 1 (the intercept)
summary(res)$coefficients[-1,4] < 0.05
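To go one step further and actually refit on the significant variables, something along these lines may work. This is only a sketch: the simulated data frame `dat`, the column names `X1` to `X5`, the 0.05 cutoff, and the use of `AIC` and `anova` for the comparison are assumptions made for illustration, not part of the original question.

```r
set.seed(42)

# Hypothetical data: 200 rows, 5 numeric predictors; only X1 and X2 affect Y
n <- 200
dat <- as.data.frame(matrix(rnorm(n * 5), ncol = 5))
names(dat) <- paste0("X", 1:5)
dat$Y <- 2 * dat$X1 - 1.5 * dat$X2 + rnorm(n)

# Full model on every predictor in the data frame
full <- glm(Y ~ ., data = dat)

# Coefficient table without the intercept row; drop = FALSE keeps it a matrix
# even if only one predictor remains
ctab <- coef(summary(full))[-1, , drop = FALSE]

# The p-values sit in the last column, whatever it is named for this family
sig_vars <- rownames(ctab)[ctab[, ncol(ctab)] < 0.05]

# Build "Y ~ <significant terms>" and refit on the reduced set
reduced <- glm(reformulate(sig_vars, response = "Y"), data = dat)

# Compare the two fits
AIC(full, reduced)
anova(reduced, full, test = "F")
```

Note that the row names of the coefficient table are coefficient names, not variable names. For numeric predictors the two coincide, but a factor predictor yields one row per non-reference level (for example "X3b"), so those names would need to be mapped back to the original column before being passed to `reformulate`.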