rocketman - 1 month ago 4

R Question

With every iteration of the loop, I'd like to fit a linear model using more historical data and see how, for example, the one-step-ahead prediction compares to the actual value. The code should be self-explanatory. The problem seems to be that `Dependent` and `Independent` are fixed in size after the first iteration (which I'd like to start at 10 data points, as shown in the code), whereas I'd like them to be dynamically sized.

```
output1 <- rep(0, 127)
output2 <- rep(0, 127)

ret <- function(x, y)
{
  for (i in 1:127)
  {
    Dependent <- y[1:(9+i)]
    Independent <- x[1:(9+i)]
    fit <- lm(Dependent ~ Independent)
    nextInput <- data.frame(Independent = x[(10+i)])
    prediction <- predict(fit, nextInput, interval="prediction")
    output1[i] <- prediction[2]
    output2[i] <- prediction[3]
  }
}
```

Answer

Here's a thought; let me know if I'm close to your intent:

```
set.seed(42)
n <- 100
x <- rnorm(n)
head(x)
# [1] 1.3709584 -0.5646982 0.3631284 0.6328626 0.4042683 -0.1061245
y <- runif(n)
head(y)
# [1] 0.8851177 0.5171111 0.8519310 0.4427963 0.1578801 0.4423246
ret <- lapply(10:n, function(i) {
  dep <- y[1:i]
  indep <- x[1:i]
  fit <- lm(dep ~ indep)
  pred <- if (i < n) {
    predict(fit, data.frame(indep = x[i+1L]), interval = "prediction")
  } else NULL
  list(fit = fit, pred = pred)
})
```

Note that I'm making a list of models/predictions instead of using a `for` loop. Though not exactly the same, this answer does a decent job explaining why this may be a good idea.
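If a `for` loop is still preferred, one likely reason the original version appears not to work is that `output1` and `output2` are assigned inside the function, so R modifies local copies and the function itself returns nothing useful. The following is only a sketch under that assumption: it preallocates the result inside the function and returns it. The names `rolling_pred` and `start` are hypothetical and do not appear in the question.

```r
# Hypothetical helper: expanding-window fit with a one-step-ahead prediction.
rolling_pred <- function(x, y, start = 10) {
  n <- length(x)
  stopifnot(length(y) == n, n > start)
  steps <- seq(start, n - 1)   # the last fit uses n - 1 points and predicts point n
  out <- matrix(NA_real_, nrow = length(steps), ncol = 3,
                dimnames = list(NULL, c("fit", "lwr", "upr")))
  for (k in seq_along(steps)) {
    i <- steps[k]
    d <- data.frame(dep = y[1:i], indep = x[1:i])   # window grows with i
    fit <- lm(dep ~ indep, data = d)
    out[k, ] <- predict(fit, data.frame(indep = x[i + 1]),
                        interval = "prediction")
  }
  out   # return the result; assignments to globals made with `<-` would be lost
}

set.seed(42)
x <- rnorm(100)
y <- runif(100)
res <- rolling_pred(x, y)
dim(res)
head(res)
```

Each row of `res` holds the point prediction and the lower/upper prediction bounds for the next observation, so nothing outside the function needs to be modified.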

Model and prediction from one of the runs:

```
ret[[50]]
# $fit
#
# Call:
# lm(formula = dep ~ indep)
#
# Coefficients:
# (Intercept)        indep
#     0.44522      0.02691
#
# $pred
#         fit        lwr      upr
# 1 0.4528911 -0.1160787 1.021861

summary(ret[[50]]$fit)
#
# Call:
# lm(formula = dep ~ indep)
#
# Residuals:
#      Min       1Q   Median       3Q      Max
# -0.42619 -0.22178 -0.00004  0.15550  0.53774
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  0.44522    0.03667  12.141   <2e-16 ***
# indep        0.02691    0.03186   0.845    0.402
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Residual standard error: 0.2816 on 57 degrees of freedom
# Multiple R-squared:  0.01236, Adjusted R-squared:  -0.004966
# F-statistic: 0.7134 on 1 and 57 DF,  p-value: 0.4018
```
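To get at the original goal of comparing each one-step-ahead prediction with the actual value, the per-run predictions can be stacked into one table. This is a sketch rather than part of the answer as posted: the names `preds`, `cmp`, and `inside` are made up for illustration, and the list is rebuilt here only so that the block runs on its own.

```r
set.seed(42)
n <- 100
x <- rnorm(n)
y <- runif(n)

# Same expanding-window list as in the answer, rebuilt for self-containment.
ret <- lapply(10:n, function(i) {
  dep <- y[1:i]
  indep <- x[1:i]
  fit <- lm(dep ~ indep)
  pred <- if (i < n) {
    predict(fit, data.frame(indep = x[i + 1L]), interval = "prediction")
  } else NULL
  list(fit = fit, pred = pred)
})

# Stack the 1 x 3 prediction matrices; rbind() ignores the trailing NULL.
preds <- do.call(rbind, lapply(ret, `[[`, "pred"))
rownames(preds) <- NULL

# Row k was fitted on points 1..(9 + k) and predicts point 10 + k.
cmp <- data.frame(
  actual = y[11:n],
  fit    = preds[, "fit"],
  lwr    = preds[, "lwr"],
  upr    = preds[, "upr"]
)
cmp$inside <- cmp$actual >= cmp$lwr & cmp$actual <= cmp$upr

head(cmp)
mean(cmp$inside)   # share of actual values inside the prediction interval
```

The `lwr` and `upr` columns correspond to what the question stored in `output1` and `output2`.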

Source (Stackoverflow)
