Kenneth Chen - 11 months ago 72

R Question

I'm working with a data.frame that contains only numeric data. I want to calculate the first-order autoregressive coefficients for each column. I chose an apply-family function for the task and defined the following function:

```
return.ar <- function(vec){
  return(as.numeric(ar(vec)$ar))
}
```

Then I applied it to a data frame that I subset by column names, as follows:

`lapply(df_return[,col.names],return.ar)`

I was expecting to get a vector of AR coefficients. Instead, I got a list with all the coefficients in the first element, like this:

```
$C.Growth
[1]  0.35629140 -0.07671252 -0.08699333 -0.27404355  0.21448342
[6] -0.19049197  0.06610908 -0.23077602

$Mkt.ret
numeric(0)

$SL
numeric(0)

$SM
numeric(0)

$SH
numeric(0)

$LL
numeric(0)

$LM
numeric(0)

$LH
numeric(0)
```

I don't understand what's going on.

The output of `dput(head(df_return))`:

```
structure(list(Year = c(1929, 1930, 1931, 1932, 1933, 1934),
C.Growth = c(0.94774902516838, 0.989078396169958, 0.911586749357132,
0.996183522774413, 1.08170234030149, 1.05797659377887), S.Return = c(-19.7068321696574,
-31.0834309393085, -45.2864376593084, -9.42504715968666,
57.0992131145999, 4.05781718258972), Rf = c(4.79316783034255,
2.58656906069154, 1.24356234069162, 0.954952840313344, 0.199213114599945,
0.147817182589718), Inflation = c(-0.0531678303425544, -0.15656906069154,
-0.15356234069162, -0.00495284031334435, 0.100786885400055,
0.0321828174102824), Mkt.ret = c(-14.9668321696574, -28.6534309393085,
-44.1964376593084, -8.47504715968666, 57.3992131145999, 4.23781718258972
), SL = c(-45.2568321696575, -35.1134309393085, -41.1864376593084,
-5.28504715968666, 166.0392131146, 34.1378171825897), SM = c(-30.7368321696574,
-31.9034309393085, -48.5364376593084, -8.94504715968666,
118.7092131146, 19.7578171825897), SH = c(-36.7568321696575,
-45.1834309393085, -51.5364376593084, 2.78495284031334, 125.7792131146,
7.95781718258972), LL = c(-19.6968321696574, -26.2734309393085,
-36.2264376593084, -7.31504715968666, 44.1492131145999, 10.6978171825897
), LM = c(0.673167830342554, -29.2434309393085, -59.9864376593084,
-16.7150471596867, 89.4692131145999, -2.93218281741028),
LH = c(-4.35683216965745, -43.1934309393085, -57.7364376593084,
-4.30504715968666, 114.7092131146, -21.8421828174103)), .Names = c("Year",
"C.Growth", "S.Return", "Rf", "Inflation", "Mkt.ret", "SL", "SM",
"SH", "LL", "LM", "LH"), row.names = c(NA, 6L), class = "data.frame")
```

Answer

Once you include your data, the diagnosis becomes easy.

`ar` automatically selects the order `p` based on AIC. Some of your columns show strong evidence of being white noise, so `ar` has selected `p = 0` for them, in which case the `$ar` field is `numeric(0)`.
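To see why a particular order was chosen, the fitted object can be inspected directly. The following is a minimal sketch on simulated white noise, not on the data from the question; the variable names `x` and `fit` are placeholders. It relies on the `order`, `ar` and `aic` components that `ar` returns, where `aic` holds the AIC difference between each candidate order and the best one.

```r
set.seed(42)
x <- rnorm(100)   # simulated white noise, a stand-in for one column

fit <- ar(x)      # default aic = TRUE: the order is selected by AIC

# AIC differences relative to the best model, named "0", "1", ...
print(fit$aic)

# The selected order is the one whose AIC difference is zero
print(fit$order)

# One coefficient per selected lag, so this is numeric(0) when order is 0
print(fit$ar)
```

Whatever order is selected, `length(fit$ar)` equals `fit$order`, which is why columns with `p = 0` show up as `numeric(0)`.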

I suggest you also use the following:

```
lapply(df_return[col.names], function (x) ar(x, order.max = 5)$order)
```

or even better:

```
fit_ar <- function(x) ar(x, order.max = 5)[c("order", "ar")]
lapply(df_return[col.names], fit_ar)
```

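The latter returns both the selected order `p` and the AR coefficients for each column. I have set `order.max = 5` so that `ar` does not pick the maximum order itself; the order actually used is still selected by AIC, from 0 up to that maximum.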
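If the goal is strictly one first-order coefficient per column, as stated in the question, one option is to switch off the AIC selection with `aic = FALSE`, so that the order equals `order.max`. A minimal sketch, assuming a simulated data frame `demo_df` in place of `df_return` (both `demo_df` and the helper `ar1_coef` are hypothetical names):

```r
# Hypothetical stand-in for df_return: two simulated columns
set.seed(1)
demo_df <- data.frame(
  a = as.numeric(arima.sim(list(ar = 0.6), n = 200)),  # AR(1) series
  b = rnorm(200)                                       # white noise
)

# aic = FALSE disables order selection, so an AR(1) model is always fitted
ar1_coef <- function(x) ar(x, order.max = 1, aic = FALSE)$ar

# Every column yields exactly one number, so sapply() simplifies the
# result to a named numeric vector
ar1_vec <- sapply(demo_df, ar1_coef)
print(ar1_vec)
```

Because every fit has the same length, `sapply` can return a plain named vector here, which is the shape the question was expecting.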

You tried to convince me that `lapply` is doing something wrong by using this `for` loop:

```
ar.vec <- numeric()
for (name in col.names)
  ar.vec <- c(ar.vec, return.ar(df_return[[ name ]]))
```

But unfortunately you **won't get anything useful** from this. Note that you used concatenation with `c()`. Columns for which `ar` selected `p = 0` contribute nothing to the result, so there is no way to tell which coefficient belongs to which column.

`lapply` is not equivalent to such a loop. You should use:

```
ar.vec <- vector("list", length(col.names))
names(ar.vec) <- col.names  # keep the column names, as lapply does
for (i in seq_along(col.names))
  ar.vec[[i]] <- return.ar(df_return[[ col.names[i] ]])
```
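To check that a list-based loop and `lapply` agree, the two results can be compared directly. This is a minimal sketch on simulated data; `demo_df`, `demo_cols`, `res_loop` and `res_lapply` are hypothetical names standing in for `df_return`, `col.names` and the two results.

```r
# Hypothetical stand-ins for df_return and col.names
set.seed(7)
demo_df <- data.frame(u = rnorm(50), v = cumsum(rnorm(50)), w = rnorm(50))
demo_cols <- c("u", "v")

return.ar <- function(vec) as.numeric(ar(vec)$ar)

# Loop version: one list slot per column, named after the column
res_loop <- vector("list", length(demo_cols))
names(res_loop) <- demo_cols
for (i in seq_along(demo_cols)) {
  res_loop[[i]] <- return.ar(demo_df[[demo_cols[i]]])
}

# lapply version
res_lapply <- lapply(demo_df[demo_cols], return.ar)

# Compare the two results element by element, including names
print(identical(res_loop, res_lapply))
```

Storing each result in its own list slot keeps zero-length results such as `numeric(0)` attached to their column, which concatenation with `c()` cannot do.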