I have a set of temperature values and a discomfort index value for each temperature. When I plot temperature (x axis) against the calculated discomfort index (y axis), I get an inverted U-shaped curve. I want to fit a non-linear regression to it and convert it into a PMML model. My aim is to get the predicted discomfort value for a given temperature.
The dataset is below:
Temp <- c(0, 5, 10, 6, 9, 13, 15, 16, 20, 21, 24, 26, 29, 30, 32, 34, 36, 38, 40, 43, 44, 45, 50, 60)
I took a look at this, and I think it is not as simple as using nls, as most of us first thought.
nls fits a parametric model, but from your data (the scatter plot) it is hard to propose a reasonable model assumption. I would suggest non-parametric smoothing instead.
There are many scatter plot smoothing methods, such as kernel smoothing (ksmooth), smoothing splines (smooth.spline) and LOESS (loess). I prefer smooth.spline, and here is what we can do with it:
fit <- smooth.spline(Temp, Disc)
See ?smooth.spline for what it takes and what it returns. We can check the fitted spline curve with
plot(Temp, Disc)
lines(fit, col = 2)
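Since the Disc values are not shown in the question, here is a minimal self-contained sketch of the three smoothers mentioned above. The response `Disc_sim` is simulated (a noisy inverted U), and the `span` and `bandwidth` values are arbitrary choices for illustration, not tuned recommendations.

```r
# Temperatures from the question
Temp <- c(0, 5, 10, 6, 9, 13, 15, 16, 20, 21, 24, 26, 29, 30, 32, 34,
          36, 38, 40, 43, 44, 45, 50, 60)

# ASSUMPTION: simulated response, NOT the poster's discomfort values
set.seed(1)
Disc_sim <- exp(-((Temp - 30) / 15)^2) + rnorm(length(Temp), sd = 0.05)

# Three non-parametric smoothers from the stats package
fit_spline <- smooth.spline(Temp, Disc_sim)        # penalized smoothing spline
fit_loess  <- loess(Disc_sim ~ Temp, span = 0.75)  # local regression
fit_kernel <- ksmooth(Temp, Disc_sim, kernel = "normal", bandwidth = 10)

# Evaluate on a grid inside the observed temperature range
x_new <- seq(min(Temp), max(Temp), length.out = 100)
pred_spline <- predict(fit_spline, x_new)          # list with components x and y
pred_loess  <- predict(fit_loess, data.frame(Temp = x_new))

# Overlay the three fitted curves on the scatter plot
plot(Temp, Disc_sim)
lines(pred_spline, col = 2)
lines(x_new, pred_loess, col = 3)
lines(fit_kernel, col = 4)
```

With real data, replace `Disc_sim` with the observed discomfort values; the three curves can then be compared by eye before settling on one smoother.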
Should you want to make predictions elsewhere, use the predict function (predict.smooth.spline). For example, to predict at Temp = 20 and Temp = 44, we can use
predict(fit, c(20,44))$y #  0.3940963 0.3752191
Predicting outside range(Temp) is not recommended, as it can suffer from poor extrapolation.
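One way to guard against accidental extrapolation is a small wrapper that returns NA for inputs outside the fitted range. The helper name `predict_in_range` is hypothetical (it is not part of the stats package), and the data below are simulated for illustration only.

```r
# Hypothetical helper: predict from a smooth.spline fit, but return NA
# for any input outside the range of the training x values
predict_in_range <- function(fit, x_new) {
  rng <- range(fit$x)                    # fit$x holds the distinct training x values
  out <- rep(NA_real_, length(x_new))
  inside <- x_new >= rng[1] & x_new <= rng[2]
  if (any(inside)) {
    out[inside] <- predict(fit, x_new[inside])$y
  }
  out
}

# ASSUMPTION: simulated inverted-U data, not the poster's values
set.seed(1)
x <- seq(0, 60, by = 3)
y <- exp(-((x - 30) / 15)^2) + rnorm(length(x), sd = 0.05)
fit_demo <- smooth.spline(x, y)

# 20 and 44 lie inside [0, 60]; 75 lies outside and comes back as NA
res <- predict_in_range(fit_demo, c(20, 44, 75))
print(res)
```

Returning NA rather than stopping with an error is a design choice; a stricter version could call `stop()` when any input falls outside the range.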
Before resorting to a non-parametric method, I also tried non-linear regression with regression splines and an orthogonal polynomial basis, but they did not give satisfying results. The major reason is that there is no penalty on smoothness. As an example, here are some attempts with poly:
try1 <- lm(Disc ~ poly(Temp, degree = 3))
try2 <- lm(Disc ~ poly(Temp, degree = 4))
try3 <- lm(Disc ~ poly(Temp, degree = 5))
plot(Temp, Disc, ylim = c(-0.3, 1.0))
x <- seq(min(Temp), max(Temp), length.out = 50)
newdat <- list(Temp = x)
lines(x, predict(try1, newdat), col = 2)
lines(x, predict(try2, newdat), col = 3)
lines(x, predict(try3, newdat), col = 4)
We can see that the fitted curve is artificial.
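For contrast with the unpenalized polynomial fits, the smoothness penalty in smooth.spline can be varied directly through its `spar` argument. The sketch below uses simulated data (an assumption, since the Disc values are not shown) and two arbitrary `spar` values, and compares the equivalent degrees of freedom reported in the `df` component of each fit.

```r
# ASSUMPTION: simulated inverted-U data, not the poster's values
set.seed(1)
x <- seq(0, 60, by = 3)
y <- exp(-((x - 30) / 15)^2) + rnorm(length(x), sd = 0.05)

# A small spar means a weak penalty (wiggly curve, many effective parameters);
# a large spar means a strong penalty (smooth curve, few effective parameters)
fit_rough  <- smooth.spline(x, y, spar = 0.2)
fit_smooth <- smooth.spline(x, y, spar = 1.0)

# Equivalent degrees of freedom of each fit
cat("df with spar = 0.2:", fit_rough$df, "\n")
cat("df with spar = 1.0:", fit_smooth$df, "\n")

# Overlay both curves on the scatter plot
plot(x, y)
lines(fit_rough, col = 2)
lines(fit_smooth, col = 4)
```

When `spar` is left unspecified, as in the fit shown earlier, smooth.spline selects the penalty by generalized cross-validation, which is usually a sensible starting point.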