Sarah - 1 month ago
R Question

Selecting columns in a data.frame to implement in a model

Is there a way to run a model (for simplicity, a linear model) using specified columns of a data.frame?

For example, I would like to be able to do something like this:

set.seed(1)
ET = runif(10, 1,20)
x1 = runif(10, 1,20)
x2 = runif(10, 1,30)
x3 = runif(10, 1,40)

Xdf = data.frame(ET = ET, X1 = x1, X2 =x2, X3 = x3)

lm(ET~Xdf[,c(2,3)], data = Xdf)


Where the linear model would be equal to
lm(ET~X1 +X2, data = Xdf)


I have tried with a matrix, but that won't work in this case: I will eventually be adding spatial correlation based upon values stored in the data.frame, and those need to be specified through the data = data.frame argument and to have certain names. I also need to be able to choose certain columns in the data, because this will be looping through multiple models using different predictors.

Any help would be greatly appreciated. Thanks!

Answer

Here's a (rather ugly) way to make it work.

I use as.formula and the paste function to make the formula before calling lm.

I'm sure there are better ways to do this, but this is what came to mind.

# ET ~ X1 + X2
f1 <- as.formula(paste("ET~", paste(names(Xdf)[c(2,3)], 
                                        collapse="+")))
lm(f1, data=Xdf)
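
A possibly tidier variant of the same idea is base R's reformulate(), which builds the formula from a character vector of term labels so the paste calls are not needed. This is only a sketch: it rebuilds the example data from the question, and the names f2 and fit are chosen here for illustration.

```r
# Rebuild the example data from the question
set.seed(1)
Xdf <- data.frame(ET = runif(10, 1, 20),
                  X1 = runif(10, 1, 20),
                  X2 = runif(10, 1, 30),
                  X3 = runif(10, 1, 40))

# Build ET ~ X1 + X2 from the names of columns 2 and 3
f2 <- reformulate(names(Xdf)[c(2, 3)], response = "ET")

# Fit the model; other columns of Xdf stay available through data =
fit <- lm(f2, data = Xdf)
coef(fit)
```

Because the formula refers to the columns by name, the coefficients are labelled X1 and X2, just as with lm(ET ~ X1 + X2, data = Xdf).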

You can also specify the individual columns, though it might be more work, and the coefficients will be labelled Xdf[, 2] and Xdf[, 3] rather than X1 and X2:

lm(ET ~ Xdf[,2] + Xdf[,3], data=Xdf)
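
Since the question mentions looping through several models with different predictors, the formula-building approach can be wrapped in lapply. The following is a minimal sketch under assumptions: the predictor sets in predictor_sets are made up for illustration, and the data are rebuilt from the question so the block runs on its own.

```r
# Rebuild the example data from the question
set.seed(1)
Xdf <- data.frame(ET = runif(10, 1, 20),
                  X1 = runif(10, 1, 20),
                  X2 = runif(10, 1, 30),
                  X3 = runif(10, 1, 40))

# Hypothetical predictor sets, given as column positions in Xdf
predictor_sets <- list(c(2, 3), c(2, 4), c(2, 3, 4))

# Fit one linear model per predictor set
fits <- lapply(predictor_sets, function(cols) {
  # e.g. cols = c(2, 3) gives the formula ET ~ X1 + X2
  f <- as.formula(paste("ET ~", paste(names(Xdf)[cols], collapse = " + ")))
  lm(f, data = Xdf)
})

# Coefficients of each fitted model
lapply(fits, coef)
```

Each model is still fitted with data = Xdf, so any other columns the model needs later (for example those used for spatial correlation) remain available by name.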