ieaggie - 1 month ago
R Question

Multiple Linear Regression: Applying Polynomial Terms to Interaction Effects

I've been racking my brain over this and cannot figure out the best way to go about it.

So far, I have fit my initial MLR:

reg=lm(register~.,data=train)


From there, I checked for interaction effects using

testinter=glm(register~(.-atemp)*(.-atemp),data=train)
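
As an aside, here is a minimal sketch of what a formula of this shape expands to. It uses the built-in mtcars data rather than the train data above, and the names d, f_star, and f_pow are only illustrative. Crossing a set of predictors with itself should give the main effects plus all pairwise interactions, which is the same set of terms as the shorter (...)^2 form.

```r
# Built-in data used purely for illustration (not the train data above)
d <- mtcars[, c("mpg", "wt", "hp", "qsec")]

# Main effects plus all pairwise interactions, leaving qsec out
f_star <- mpg ~ (. - qsec) * (. - qsec)
f_pow  <- mpg ~ (. - qsec)^2

# Expand the dot against the data and list the resulting terms
labels_star <- attr(terms(f_star, data = d), "term.labels")
labels_pow  <- attr(terms(f_pow,  data = d), "term.labels")
print(labels_star)
print(labels_pow)
```

Both formulas should list the two remaining main effects and their single interaction, so the ^2 form can be used as a shorthand for the screening step.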


After determining the significant interactions, I included all of them in my model:

reg=lm(register~season:month+season:temp+year:month+
year:weekday+year:temp+month:temp+holiday:windspeed+weekday:
weathersit+weekday:hum+weathersit:hum+
hum+weathersit+season+year+temp, data=train)


However, after looking at some of the variables, hum and temp needed to be transformed into polynomial terms: hum^2 and temp^3.

My question is: how can I include these in the interaction effects?
This is my attempt so far, where I replaced "hum" with poly(hum,2,raw=T) and "temp" with poly(temp,3,raw=T), but I'm not sure if it's correct.

reg=lm(register~season:month+season:poly(temp,3,raw=T)+year:month+
year:weekday+year:poly(temp,3,raw=T)+month:poly(temp,3,raw=T)+holiday:windspeed+weekday:
weathersit+weekday:poly(hum,2,raw=T)+weathersit:poly(hum,2,raw=T)+
poly(hum,2,raw=T)+weathersit+season+year+poly(temp,3,raw=T), data=train)

Answer

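Use I() whenever you need an arithmetic transformation inside an lm or glm formula. Operators such as ^ have a special meaning in formula syntax, and I() tells R to evaluate them arithmetically instead.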

Here's an example using the iris dataset:

data(iris)
reg <- lm(Sepal.Length~Sepal.Width:I(Petal.Length^2), data=iris)
summary(reg)
Call:
lm(formula = Sepal.Length ~ Sepal.Width:I(Petal.Length^2), data = iris)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.9405 -0.2357  0.0029  0.2224  0.8241 

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
(Intercept)                   4.8649126  0.0478357  101.70   <2e-16 ***
Sepal.Width:I(Petal.Length^2) 0.0192699  0.0007492   25.72   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3552 on 148 degrees of freedom
Multiple R-squared:  0.8172,  Adjusted R-squared:  0.816 
F-statistic: 661.6 on 1 and 148 DF,  p-value: < 2.2e-16
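
The same idea carries over to the polynomial interactions in the question. As a minimal sketch on iris (not the asker's train data, which isn't available here), interacting a numeric predictor with poly(x, 2, raw = TRUE) should give the same fit as writing out the linear and squared interaction terms with I(). The model names m_poly, m_I, and m_bare are only illustrative. The last model shows why I() matters: without it, ^ is read as formula crossing and the squared term silently drops out.

```r
data(iris)

# Interaction with a raw quadratic polynomial, in the style of the question's attempt
m_poly <- lm(Sepal.Length ~ Sepal.Width:poly(Petal.Length, 2, raw = TRUE),
             data = iris)

# The same terms written out by hand with I()
m_I <- lm(Sepal.Length ~ Sepal.Width:Petal.Length + Sepal.Width:I(Petal.Length^2),
          data = iris)

# Both parameterisations describe the same model
print(all.equal(unname(coef(m_poly)), unname(coef(m_I))))
print(all.equal(fitted(m_poly), fitted(m_I)))

# Without I(), ^2 is formula syntax, so no squared term is fitted
m_bare <- lm(Sepal.Length ~ Sepal.Width:Petal.Length^2, data = iris)
print(names(coef(m_bare)))
```

Either form can be used inside an interaction; poly(..., raw = TRUE) is simply more compact for higher degrees such as temp^3.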