baha-kev baha-kev - 3 days ago 6
R Question

block bootstrap from subject list

I'm trying to efficiently implement a block bootstrap technique to get the distribution of regression coefficients. The main outline is as follows:

I have a panel data set where, say, firm and year are the indices. For each iteration of the bootstrap, I wish to sample n subjects with replacement. From this sample, I need to construct a new data frame that is an rbind() stack of all the observations for each sampled subject. With this new data frame, I can run the regression and pull out the coefficients. Repeat for a number of iterations, say 100.


  • Each firm can potentially be selected multiple times, so I need to include its data multiple times in each iteration's data set.

  • Using a loop and subset approach, like below, seems computationally burdensome.

  • My real data frame, n, and number of iterations are much larger than in the example below.
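One way to handle repeated firms without a subset loop is to precompute the row numbers that belong to each firm and then index the data frame once per iteration. The following is a minimal sketch on a toy panel; the names panel, rows_by_firm, picked, and idx are hypothetical and not part of the original code.

```r
# Toy panel (hypothetical data): 4 firms, 3 years each.
panel <- data.frame(
  firm = rep(c("A", "B", "C", "D"), each = 3),
  year = rep(2001:2003, times = 4),
  y    = 1:12
)

# Row numbers belonging to each firm, computed once up front.
rows_by_firm <- split(seq_len(nrow(panel)), panel$firm)

set.seed(123)
picked <- sample(names(rows_by_firm), size = 4, replace = TRUE)

# Concatenate the row numbers of the drawn firms; a firm drawn twice
# contributes its rows twice.
idx <- unlist(rows_by_firm[picked], use.names = FALSE)
newdata <- panel[idx, ]

table(picked)        # how often each firm was drawn
table(newdata$firm)  # three rows per draw of that firm
```

Because only integer vectors are shuffled, the data frame is copied once per iteration rather than once per sampled firm.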



My initial thought is to break the existing data frame into a list by subject using the split() command. From there, use sample(unique(df1$subject), n, replace=TRUE) to get the new list, then perhaps use quickdf() from the plyr package to construct a new data frame?
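The split() idea can be sketched roughly as follows. This is only an illustration on made-up data: df1, its columns, and the helper one_draw() are assumptions, and base rbind() is used in place of plyr's quickdf().

```r
# Toy panel standing in for the real data (hypothetical values).
set.seed(1)
df1 <- data.frame(
  subject = rep(1:5, each = 4),
  year    = rep(2001:2004, times = 5),
  x       = rnorm(20)
)
df1$y <- 2 + 3 * df1$x + rnorm(20)

# Split once, outside the bootstrap loop.
by_subject <- split(df1, df1$subject)
ids <- names(by_subject)
n <- length(ids)

# One bootstrap iteration: draw subjects, stack their blocks, refit.
one_draw <- function() {
  picked <- sample(ids, n, replace = TRUE)
  # Indexing the list by name repeats a subject's block when it is
  # drawn more than once.
  newdata <- do.call(rbind, by_subject[picked])
  coef(lm(y ~ x, data = newdata))
}

# 100 iterations; one row per iteration, one column per coefficient.
boot_coefs <- t(replicate(100, one_draw()))
head(boot_coefs)
```

Splitting up front means each iteration only looks up list elements instead of rescanning the full data frame with subset().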

Any thoughts are appreciated!

Example slow code:

library(plm)  # provides the Grunfeld panel data set
data("Grunfeld", package = "plm")

firms = unique(Grunfeld$firm)
n = 10
iterations = 100
mybootresults = list()

for (j in 1:iterations) {

  # Draw n firm IDs (not positions) with replacement
  v = sample(firms, n, replace = TRUE)
  newdata = NULL

  # Stack every observation of each sampled firm; repeats are kept
  for (i in 1:n) {
    newdata = rbind(newdata, subset(Grunfeld, firm == v[i]))
  }

  reg1 = lm(value ~ inv + capital, data = newdata)
  mybootresults[[j]] = coefficients(reg1)

}

# One row per iteration, one column per coefficient
mybootresults = as.data.frame(t(matrix(unlist(mybootresults), ncol = iterations)))
names(mybootresults) = names(reg1$coefficients)
mybootresults

(Intercept) inv capital
1 373.8591 6.981309 -0.9801547
2 370.6743 6.633642 -1.4526338
3 528.8436 6.960226 -1.1597901
4 331.6979 6.239426 -1.0349230
5 507.7339 8.924227 -2.8661479
...
...

Answer

How about something like this:

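library(boot)  # for boot(); assumes Grunfeld is already loaded as in the question

# x is the vector of firm IDs; i holds the resampled positions supplied by boot()
myfit <- function(x, i) {
   mydata <- do.call("rbind", lapply(i, function(n) subset(Grunfeld, firm==x[n])))
   coefficients(lm(value ~ inv + capital, data = mydata))
}

firms <- unique(Grunfeld$firm)

b0 <- boot(firms, myfit, 999)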
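To get at the coefficient distribution afterwards, the object returned by boot() keeps the replicates in its t component (one row per replicate) and the full-sample estimate in t0. Here is a small sketch of reading them out, using a toy panel rather than Grunfeld so it runs without plm; panel, fit_firms, b, and boot_se are hypothetical names.

```r
library(boot)

# Toy panel (hypothetical data): 6 firms, 5 observations each.
set.seed(42)
panel <- data.frame(firm = rep(1:6, each = 5), inv = runif(30, 10, 100))
panel$value <- 50 + 4 * panel$inv + rnorm(30, sd = 20)

# Same shape as myfit(): x holds firm IDs, i the resampled positions.
fit_firms <- function(x, i) {
  d <- do.call(rbind, lapply(x[i], function(f) panel[panel$firm == f, ]))
  coef(lm(value ~ inv, data = d))
}

b <- boot(unique(panel$firm), fit_firms, R = 199)

# b$t has one row per replicate; label its columns from b$t0.
colnames(b$t) <- names(b$t0)
boot_se <- apply(b$t, 2, sd)   # bootstrap standard errors
boot_se

# Percentile interval for the second coefficient (the slope on inv).
boot.ci(b, type = "perc", index = 2)
```

The same calls should apply to b0 above, with index selecting which of the three coefficients to summarize.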