REnthusiast - 5 months ago 29

R Question

When I try to define my linear model in R as follows:

`lm1 <- lm(predictorvariable ~ x1+x2+x3, data=dataframe.df)`

I get the following error message:

``Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels``

Is there any way to ignore this or fix it? Some of the variables are factors and some are not.

Answer

This error occurs when an independent (right-hand-side) variable is a factor or character vector that takes only one value in the data.

Example: iris data in R

```
> model1 <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
> model1

Call:
lm(formula = Sepal.Length ~ Sepal.Width + Species, data = iris)

Coefficients:
      (Intercept)        Sepal.Width  Speciesversicolor   Speciesvirginica
           2.2514             0.8036             1.4587             1.9468
```

Now, if your data consists of only one species:

```
> model1 <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris[iris$Species == "setosa", ])
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
```
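One detail in this example is easy to miss, so here is a small sketch of it. Subsetting leaves the unused levels attached to `Species`, so the factor still reports three levels, while `lm` drops unused levels when it builds the model frame and therefore sees only one. The name `setosa` is just an illustrative variable.

```r
# Subset iris to a single species; the factor keeps its unused levels.
setosa <- iris[iris$Species == "setosa", ]

nlevels(setosa$Species)              # 3: levels stored on the factor
nlevels(droplevels(setosa$Species))  # 1: levels actually present in the data
length(unique(setosa$Species))       # 1: same count, also works for character vectors
```

This is why a check based on `levels()` alone can miss the problem on subsetted data.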

If a numeric variable (here Sepal.Width) takes only a single value, say 3, the model still runs, but the coefficient of that variable is NA:

```
> model2 <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris[iris$Sepal.Width == 3, ])
> model2

Call:
lm(formula = Sepal.Length ~ Sepal.Width + Species, data = iris[iris$Sepal.Width ==
    3, ])

Coefficients:
      (Intercept)        Sepal.Width  Speciesversicolor   Speciesvirginica
            4.700                 NA              1.250              2.017
```

**Solution**: An independent variable that takes only one value has no variation, so it cannot contribute to the model. You need to drop that variable, whether it is numeric, character, or factor.
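As a rough sketch of this advice, the constant columns can be found for any column type by counting distinct non-missing values. The names `d`, `n_distinct`, `constant_cols`, and `model3` are illustrative and not part of the original answer.

```r
# Use the single-width subset from the example above.
d <- iris[iris$Sepal.Width == 3, ]

# Count distinct non-missing values per column, whatever the column type.
n_distinct <- sapply(d, function(x) length(unique(x[!is.na(x)])))

# Columns with fewer than two distinct values carry no information for lm().
constant_cols <- names(n_distinct)[n_distinct < 2]
constant_cols  # "Sepal.Width"

# Refit without the constant predictor.
model3 <- lm(Sepal.Length ~ Species, data = d)
```

Counting with `unique()` rather than `levels()` works for numeric, character, and factor columns alike.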

**Updated as per comments:** Since the error occurs only with factor or character variables, you can focus on those and check whether each one has exactly one level (DROP) or more than one (NODROP).

To check whether each variable is a factor, use the following code:

```
> l <- sapply(iris, is.factor)
> l
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
       FALSE        FALSE        FALSE        FALSE         TRUE
```

Then you can get a data frame containing only the factor variables:

```
# drop = FALSE keeps a data frame even when only one column is selected
m <- iris[, l, drop = FALSE]
```

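Now count the levels actually present in each factor. Unused levels are dropped first, because `lm` also drops them when it fits the model. Any factor left with a single level must be dropped:

```
n <- sapply(m, function(x) nlevels(droplevels(x)))
ifelse(n == 1, "DROP", "NODROP")
```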

Note: A factor variable with only one level is the variable you have to drop.
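Character columns can trigger the same error but are not caught by `is.factor`. The following is a minimal sketch of one way to cover both types and refit automatically. It assumes a made-up toy data frame `df` with response `y`; the data and all variable names are illustrative only.

```r
# Toy data (illustrative only): 'group' is a character column with one value.
df <- data.frame(y = c(1.2, 2.3, 3.1, 4.8),
                 x = c(1, 2, 3, 4),
                 group = c("a", "a", "a", "a"),
                 stringsAsFactors = FALSE)

# Select factor and character columns, keeping a data frame even for one column.
is_cat <- sapply(df, function(x) is.factor(x) || is.character(x))
cats <- df[, is_cat, drop = FALSE]

# Count the values actually present and flag columns with fewer than two.
n_vals <- sapply(cats, function(x) length(unique(x[!is.na(x)])))
to_drop <- names(n_vals)[n_vals < 2]

# Build the formula from the remaining predictors and fit the model.
preds <- setdiff(names(df), c("y", to_drop))
fit <- lm(reformulate(preds, response = "y"), data = df)
```

Here `to_drop` contains `"group"`, so the model is fitted as `y ~ x`. If every predictor were dropped, `reformulate()` would fail on the empty vector, so that case would need separate handling.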