zy.G - 3 years ago 141
R Question

# ARIMA model with nonlinear exogenous variable in R

I'm doing a non-linear regression in R and want to add one moving-average term to my model to eliminate the autocorrelations in residuals.

Basically, here is the model:

``````y[n] = a + log((x1[n])^g + (x2[n])^g) + c*e[n-1] + e[n]
``````

where
`[e]`
is the moving average term.

I plan to use
`ARIMA(0, 0, 1)`
to model residuals. However, I do not know which function I should use in R to add non-linear exogenous part to ARIMA model.

`nls`
command to estimate
`a`
and
`g`
, but do not know how to deal with
`e[n]`
.

I know that
`xreg`
in
`arima`
can handle ARIMA model with linear exogenous variables. Is there a similar function to handle ARIMA model with nonlinear exogenous variables?

Thank you for the help in advance!

`nlme` has such capability, as it is fitting non-linear mixed models. You can think of it an extension to `nls` (a fixed-effect only non-linear regression), by allowing random effect and correlated errors.

`nlme` can handle ARMA correlation, by something like `correlation = corARMA(0.2, ~ 1, p = 0, q = 1, fixed = TRUE)`. This means, that residuals are MA(1) process, with initial guess of coefficient 0.2, but to be updated during model fitting. The `~ 1` suggests that `MA(1)` is on intercept and there is no further grouping structure.

I am not an expert in `nlme`, but I know `nlme` is what you need. I produce the following example, but since I am not an expert, I can't get `nlme` work at the moment. I post it here to give a start / flavour.

``````set.seed(0)
x1 <- runif(100)
x2 <- runif(100)
## MA(1) correlated error, with innovation standard deviation 0.1
e <- arima.sim(model = list(ma = 0.5), n = 100, sd = 0.1)
## a true model, with `a = 0.2, g = 0.5`
y0 <- 0.2 + log(x1 ^ 0.5 + x2 ^ 0.5)
## observations
y <- y0 + e

## no need to install; it comes with R; just `library()` it
library(nlme)

fit <- nlme(y ~ a + log(x1 ^ g + x2 ^ g), fixed = a + g ~ 1,
start = list(a = 0.5, g = 1),
correlation = corARMA(0.2, form = ~ 1, p = 0, q = 1, fixed = FALSE))
``````

Similar to `nls`, we have an overall model formula `y ~ a + log(x1 ^ g + x2 ^ g)`, and starting values are required for iteration process. I have chosen `start = list(a = 0.5, g = 1)`. The `correlation` bit has been explained in the beginning.

`fixed` and `random` arguments in `nlme` specify what should be seen as fixed effects and random effects in the overall formula. Since we have no random effect, we leave it unspecified. We want `a` and `g` as fixed effect, so I tried something like `fixed = a + g ~ 1`. Unfortunately it does not quite work, for some reason I don't know. I read the `?nlme`, and thought this formula means that we want a common `a` and `g` for all observations, but later `nlme` reports an error saying this is not a valid group formula.

I am also investing at this; as I said, the above gives us a start. We are already fairly close to the final answer.

Thanks to user20650 for point out my awkward error. I should use `gnls` function rather than `nlme`. By design nature of `nlme` package, functions `lme` and `nlme` have to take a `random` argument to work. Luckily, there are several other routines in `nlme` package for extending linear models and non-linear models.

• `gls` and `gnls` extend `lm` and `nls` by allowing non-diagonal variance functions.

So, I should really use `gnls` instead:

``````## no `fixed` argument as `gnls` is a fixed-effect only
fit <- gnls(y ~ a + log(x1 ^ g + x2 ^ g), start = list(a = 0.5, g = 1),
correlation = corARMA(0.2, form = ~ 1, p = 0, q = 1, fixed = FALSE))

#Generalized nonlinear least squares fit
#  Model: y ~ a + log(x1^g + x2^g)
#  Data: NULL
#  Log-likelihood: 92.44078
#
#Coefficients:
#        a         g
#0.1915396 0.5007640
#
#Correlation Structure: ARMA(0,1)
# Formula: ~1
# Parameter estimate(s):
#   Theta1
#0.4184961
#Degrees of freedom: 100 total; 98 residual
#Residual standard error: 0.1050295
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download