Kenneth Chen - 2 months ago
R Question

Applying `ar` (autoregressive model) for my data frame using `lapply` returns `numeric(0)`?

I'm working with a data.frame that contains only numeric data, and I want to calculate the first-order autoregressive coefficient for each column. I chose lapply for the task and defined the following function:

return.ar <- function(vec){
return(as.numeric(ar(vec)$ar))
}


Then I applied it to the data frame, subset by column names, as follows:

lapply(df_return[,col.names],return.ar)


I was expecting to get a vector of AR coefficients. Instead I got a list in which all the coefficients appear to be put in the first element, like the following:

$C.Growth
[1] 0.35629140 -0.07671252 -0.08699333 -0.27404355 0.21448342
[6] -0.19049197 0.06610908 -0.23077602

$Mkt.ret
numeric(0)

$SL
numeric(0)

$SM
numeric(0)

$SH
numeric(0)

$LL
numeric(0)

$LM
numeric(0)

$LH
numeric(0)


I don't understand what's going on.

The output of
dput(head(df_return))
looks like the following:

structure(list(Year = c(1929, 1930, 1931, 1932, 1933, 1934),
C.Growth = c(0.94774902516838, 0.989078396169958, 0.911586749357132,
0.996183522774413, 1.08170234030149, 1.05797659377887), S.Return = c(-19.7068321696574,
-31.0834309393085, -45.2864376593084, -9.42504715968666,
57.0992131145999, 4.05781718258972), Rf = c(4.79316783034255,
2.58656906069154, 1.24356234069162, 0.954952840313344, 0.199213114599945,
0.147817182589718), Inflation = c(-0.0531678303425544, -0.15656906069154,
-0.15356234069162, -0.00495284031334435, 0.100786885400055,
0.0321828174102824), Mkt.ret = c(-14.9668321696574, -28.6534309393085,
-44.1964376593084, -8.47504715968666, 57.3992131145999, 4.23781718258972
), SL = c(-45.2568321696575, -35.1134309393085, -41.1864376593084,
-5.28504715968666, 166.0392131146, 34.1378171825897), SM = c(-30.7368321696574,
-31.9034309393085, -48.5364376593084, -8.94504715968666,
118.7092131146, 19.7578171825897), SH = c(-36.7568321696575,
-45.1834309393085, -51.5364376593084, 2.78495284031334, 125.7792131146,
7.95781718258972), LL = c(-19.6968321696574, -26.2734309393085,
-36.2264376593084, -7.31504715968666, 44.1492131145999, 10.6978171825897
), LM = c(0.673167830342554, -29.2434309393085, -59.9864376593084,
-16.7150471596867, 89.4692131145999, -2.93218281741028),
LH = c(-4.35683216965745, -43.1934309393085, -57.7364376593084,
-4.30504715968666, 114.7092131146, -21.8421828174103)), .Names = c("Year",
"C.Growth", "S.Return", "Rf", "Inflation", "Mkt.ret", "SL", "SM",
"SH", "LL", "LM", "LH"), row.names = c(NA, 6L), class = "data.frame")

Answer

Once you include your data, the diagnosis becomes easy.

ar automatically selects the order p based on AIC. For some of your columns the evidence points strongly to white noise, so ar has selected p = 0, in which case the $ar field is numeric(0).
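As a minimal sketch of this behaviour, the snippet below fits ar to two simulated series. The series, the seed and the variable names are assumptions made for illustration and are not the asker's data. Which order AIC picks depends on the sample, so no particular order is promised here; what always holds is that $ar has exactly as many elements as $order.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical series, not taken from df_return
white_noise <- rnorm(200)
ar1_series  <- as.numeric(arima.sim(model = list(ar = 0.7), n = 200))

fit_wn  <- ar(white_noise)   # order chosen by AIC (the default, aic = TRUE)
fit_ar1 <- ar(ar1_series)

# $ar holds one coefficient per selected lag, so order 0 means numeric(0)
c(order = fit_wn$order,  n_coef = length(fit_wn$ar))
c(order = fit_ar1$order, n_coef = length(fit_ar1$ar))
```

If the first line reports order 0, printing fit_wn$ar shows the same numeric(0) seen in the question.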

I suggest you also use the following:

lapply(df_return[col.names], function (x) ar(x, order.max = 5)$order)

or even better:

fit_ar <- function(x) ar(x, order.max = 5)[c("order", "ar")]
lapply(df_return[col.names], fit_ar)

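The latter returns both p and the AR coefficients for each column. I have set order.max = 5 explicitly so that ar does not pick the maximum order itself (by default it uses the smaller of N - 1 and 10 * log10(N)). With the default aic = TRUE, ar still selects p by AIC, but only among orders 0 to 5.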
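Since the question asks specifically for first-order coefficients, one possible variation is to switch off AIC selection with aic = FALSE, in which case ar fits a model of exactly order.max lags. The sketch below assumes a small simulated data frame (demo_df, a hypothetical stand-in for df_return[col.names]) and a helper named ar1_coef that is not part of the original post.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical stand-in for df_return[col.names]
demo_df <- data.frame(
  x1 = as.numeric(arima.sim(model = list(ar = 0.6), n = 100)),
  x2 = rnorm(100),
  x3 = as.numeric(arima.sim(model = list(ar = -0.4), n = 100))
)

# aic = FALSE makes ar() fit exactly order.max lags instead of selecting p by AIC
ar1_coef <- function(x) ar(x, order.max = 1, aic = FALSE)$ar

# vapply() returns a named numeric vector and stops if a fit does not yield one value
ar1_vec <- vapply(demo_df, ar1_coef, numeric(1))
ar1_vec
```

Every column then gets exactly one coefficient, including columns that look like white noise, so the result is a plain named vector rather than a list with empty entries. Whether forcing p = 1 is appropriate is a modelling decision.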


You tried to convince me that lapply is doing something wrong by using this for loop:

ar.vec <- numeric()
for (name in col.names)
   ar.vec <- c(ar.vec, return.ar(df_return[[ name ]]))

Unfortunately, you won't learn anything useful from this. Because you used concatenation with c(), every numeric(0) result simply vanishes, so there is no way to tell which coefficient belongs to which column.
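To see why the concatenating loop hides the problem, here is a small sketch with made-up per-column results. The values in coefs are hypothetical and merely mimic the shape of the output in the question.

```r
# Hypothetical per-column results: only the first column has a nonzero order
coefs <- list(C.Growth = c(0.36, -0.08), Mkt.ret = numeric(0), SL = numeric(0))

# Same pattern as the loop above: grow a flat vector with c()
flat <- numeric()
for (nm in names(coefs)) flat <- c(flat, coefs[[nm]])

length(flat)    # 2: the two empty results left no trace
lengths(coefs)  # the list still has one entry per column
```

The flat vector looks like "all coefficients in one place", which is the same information the lapply result carries, minus the column boundaries.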

lapply is not equivalent to such a loop. For a fair comparison, you should use:

ar.vec <- vector("list", length(col.names))  # one slot per column, as lapply does
for (i in seq_along(col.names))              # seq_along is safe for zero-length input
   ar.vec[[i]] <- return.ar(df_return[[ col.names[i] ]])
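A quick way to check that a list-filling loop and lapply agree is to run both on the same input and compare. The sketch below uses a simulated data frame (demo_df and cols are hypothetical names, not the asker's objects) and the return.ar function from the question.

```r
set.seed(7)  # arbitrary seed, only for reproducibility

# Hypothetical stand-in for df_return and col.names
demo_df <- data.frame(
  u = rnorm(80),
  v = as.numeric(arima.sim(model = list(ar = 0.5), n = 80))
)
cols <- names(demo_df)

return.ar <- function(vec) as.numeric(ar(vec)$ar)

by_lapply <- lapply(demo_df[cols], return.ar)

by_loop <- vector("list", length(cols))
for (i in seq_along(cols))
  by_loop[[i]] <- return.ar(demo_df[[cols[i]]])
names(by_loop) <- cols  # lapply keeps the column names, so add them here too

identical(by_lapply, by_loop)
```

Any numeric(0) entries show up in both results at the same positions, which is the point: the empty entries come from ar, not from lapply.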