Max - 4 months ago 29

R Question

Suppose there is a data.frame **data_frame** and one wants to regress the target column **Y** on some of the other columns. For that purpose a formula and a model are usually used. For example:

`linear_model <- lm(Y~FACTOR_NAME_1+FACTOR_NAME_2, data_frame)`

That does the job well if the formula is coded statically. If one wants to loop over several models with a constant number of explanatory variables (say, 2), it can be done like this:

```
for (i in 1:(factor_number - 1)) {
  for (j in (i + 1):factor_number) {
    linear_model <- lm(Y ~ F1 + F2,
                       list(Y = data_frame$Y, F1 = data_frame[[i]], F2 = data_frame[[j]]))
    # analyze linear_model further...
  }
}
```

My question is how to achieve the same effect when the number of variables changes dynamically while the program is running.

```
for (number_of_factors in 1:5) {
  # then loop over the subsets of cardinality number_of_factors
  for (factors_subset in all_subsets_with_fixed_cardinality) {
    # here I want to fit a model with the factors from factors_subset
    linear_model <- lm(Does R provide something to write here?)
  }
}
```

Thank you.

Answer

See `?as.formula`, e.g.:

```
> listoffactors <- c("factor1","factor2")
> as.formula(paste("y~",paste(listoffactors,collapse="+")))
y ~ factor1 + factor2
```

where `listoffactors` is a character vector containing the names of the factors you want to use in the model. You can pass the resulting formula directly to `lm`, e.g.:

```
> y <- rnorm(100)
> factor1 <- rep(1:2,each=50)
> factor2 <- rep(3:4,50)
> lm( as.formula(paste("y~",paste(listoffactors,collapse="+"))))

Call:
lm(formula = as.formula(paste("y~", paste(listoffactors, collapse = "+"))))

Coefficients:
(Intercept)      factor1      factor2  
    -1.4443       0.2052       0.3022  
```
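To tie this back to the loop in the question, here is a minimal sketch of how the formula trick might be combined with `combn()` to go over every subset of predictors. The data frame and its column names (`X1` to `X4`) are made up for illustration, and collecting the fits in a list called `models` is just one possible way to keep them for later analysis.

```r
# Hypothetical example data: a response Y and four candidate predictors.
set.seed(1)
data_frame <- data.frame(
  Y  = rnorm(100),
  X1 = rnorm(100),
  X2 = rnorm(100),
  X3 = rnorm(100),
  X4 = rnorm(100)
)

# Every column except the response is treated as a candidate predictor.
candidates <- setdiff(names(data_frame), "Y")

models <- list()
for (number_of_factors in seq_along(candidates)) {
  # With simplify = FALSE, combn() returns a list with one character
  # vector of column names per subset of the given size.
  subsets <- combn(candidates, number_of_factors, simplify = FALSE)
  for (factors_subset in subsets) {
    # Build the formula from the names, as in the answer above.
    model_formula <- as.formula(
      paste("Y ~", paste(factors_subset, collapse = " + "))
    )
    # Store each fit under a readable key such as "X1+X3".
    key <- paste(factors_subset, collapse = "+")
    models[[key]] <- lm(model_formula, data = data_frame)
  }
}

length(models)  # 15 models: 4 + 6 + 4 + 1 subsets
```

Passing `data = data_frame` means the names in the formula are looked up as columns of the data frame rather than as free-standing vectors. `reformulate(factors_subset, response = "Y")` from the stats package should give an equivalent formula without the manual `paste()`.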