Mushrambo Mushrambo - 1 month ago 12
R Question

Error when fitting ARIMA model to my multiple time series and storing forecast into a matrix

As it is large I can't

dput
it here. But suppose the
realmatrix
is a "mts" with non-trivial values

realmatrix <- matrix(NA, ncol = 100, nrow = 138)


In fact it stores 100 time series with length (rows) = 138 (from Jan 2005 to June 2016).

I want to store the
Arima
forecasts (12 months ahead: that is, from July 2016 to June 2017) in another matrix
farimamatrix
(which should have 12 rows and 100 columns), via the following loop:

farimamatrix <- matrix(NA, nrow = 12, ncol = 100)

m <- k <- list()

for (i in 1:100) {
try(m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(1,0,1)))
k[[i]] <- forecast.Arima(m[[i]], h=12)
farimamatrix[,i] <- fitted(k[[i]])
}


But I am getting the following message:


Error in farimamatrix[, i] <- fitted(k[[i]]) :

incorrect number of subscripts on matrix


What's wrong? Thanks in advance.




Edited (24/10): updated / corrected under Zheyuan's answer and previous problem gone

Original data:

tsdata <-
structure(c(28220L, 27699L, 28445L, 29207L, 28482L, 28326L, 28322L,
28611L, 29187L, 29145L, 29288L, 29352L, 28881L, 29383L, 29898L,
29888L, 28925L, 29069L, 29114L, 29886L, 29917L, 30144L, 30531L,
30494L, 30700L, 30325L, 31313L, 32031L, 31383L, 30767L, 30500L,
31181L, 31736L, 32136L, 32654L, 32305L, 31856L, 31731L, 32119L,
31953L, 32300L, 31743L, 32150L, 33014L, 32964L, 33674L, 33410L,
31559L, 30667L, 30495L, 31978L, 32043L, 30945L, 30715L, 31325L,
32262L, 32717L, 33420L, 33617L, 34123L, 33362L, 33731L, 35118L,
35027L, 34298L, 34171L, 33851L, 34715L, 35184L, 35190L, 35079L,
35958L, 35875L, 35446L, 36352L, 36050L, 35567L, 35161L, 35419L,
36337L, 36967L, 36745L, 36370L, 36744L, 36303L, 36899L, 38621L,
37994L, 36809L, 36527L, 35916L, 37178L, 37661L, 37794L, 38642L,
37763L, 38367L, 38006L, 38442L, 38654L, 38345L, 37628L, 37698L,
38613L, 38525L, 39389L, 39920L, 39556L, 40280L, 41653L, 40269L,
39592L, 39100L, 37726L, 37867L, 38551L, 38895L, 40100L, 40950L,
39838L, 40643L, 40611L, 39611L, 39445L, 38059L, 37131L, 36697L,
37746L, 37733L, 39188L, 39127L, 38554L, 38219L, 38497L, 39165L,
40077L, 38370L, 37174L), .Dim = c(138L, 1L), .Dimnames = list(
NULL, "Data"), .Tsp = c(2005, 2016.41666666667, 12), class = "ts")


Code

library("forecast")

z <- stl(tsdata[, "Data"], s.window="periodic")

t <- z$time.series[,"trend"]
s <- z$time.series[,"seasonal"]
e <- z$time.series[,"remainder"]

# error matrix
ematrix <- matrix(rnorm(138 * 100, sd = 100), nrow = 138)

# generating a ts class error matrix
ematrixts <- ts(ematrix, start=c(2005,1), freq=12)

# combining the trend + season + error matrix into a real matrix
realmatrix <- t + s + ematrixts

# creating a (forecast) arima matrix
farimamatrix <- matrix(NA, ncol = 100, nrow = 12)

m <- k <- vector("list", length = 100)

for (i in 1:100) {
try(m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(1,0,1)))
print(i)
k[[i]] <- forecast.Arima(m[[i]], h = 12)
farimamatrix[,i] <- k[[i]]$mean
}

# ts.plot(farimamatrix[,1:100],col = c(rep("gray",100),rep("red",1)))


The loop seems to work, but breaks down after a few iterations due to failure of
Arima
:


Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean, : " non-stationary seasonal AR part from CSS

Answer

Your loop body looks correct, although I would use m <- k <- vector("list", length = 100) instead of m <- k <- list(). But I doubt about fitted(k[[i]]). fitted() is used to extract fitted values rather than your forecast, isn't it? Let's have a check with a small reproducible example:

library(forecast)
fit <- Arima(WWWusage,c(3,1,0))
fore <- forecast(fit, h = 10)

I asked for 10 steps ahead prediction, but

str(fitted(fore))
# Time-Series [1:100] from 1 to 100: 87.9 86.1 81.2 87.1 83 ...

This is the fitted time series (rolling one-step-ahead forecast at observations), not the prediction you want. You should use

fore$mean
#Start = 101 
#End = 110 
#Frequency = 1 
# [1] 219.6608 219.2299 218.2766 217.3484 216.7633 216.3785 216.0062 215.6326
# [9] 215.3175 215.0749

So, replace your fitted(k[[i]]) with k[[i]]$mean.


Let's generate some toy data (similar to the one you described in your comment) to have a test:

set.seed(0)
trend <- 0.1 * (1:138)
seasonal <- rep_len(sin((1:12) * pi / 6), 138)
correlation <- arima.sim(list(ar = 0.5, ma = 0.3), n = 138)
x <- ts(trend + seasonal + correlation, start = c(2005, 1), frequency = 12)
ts.plot(x)

toy time series

We now simply replicate this time series 100 times to resemble your realmatrix:

realmatrix <- structure(replicate(100, x), tsp = tsp(x),
                        class = c("mts", "ts", "matrix"))

Now let's run your loop:

farimamatrix <- matrix(NA, ncol = 100, nrow = 12)

m <- k <- vector("list", length = 100)

for (i in 1:100) {
  try(m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(1,0,1)))
  k[[i]] <- forecast.Arima(m[[i]], h = 12)
  farimamatrix[,i] <- k[[i]]$mean
  }

Although it takes some time to run, it returns correct result without error:

str(farimamatrix)
# num [1:12, 1:100] 12.9 12.5 12.3 12.3 12.7 ...
Comments