baha-kev - 1 year ago 68
R Question

# block bootstrap from subject list

I'm trying to efficiently implement a block bootstrap technique to get the distribution of regression coefficients. The main outline is as follows:

I have a panel data set in which `firm` and `year` are the indices. For each iteration of the bootstrap, I wish to sample n subjects with replacement. From this sample, I need to construct a new data frame that is an `rbind()` stack of all the observations for each sampled subject. With this new data frame, I can run the regression and pull out the coefficients. Repeat for a number of iterations, say 100.

• Each firm can potentially be selected multiple times, so I need to include its data multiple times in each iteration's data set.

• Using a loop and subset approach, like below, seems computationally burdensome.

• My real data frame, n, and number of iterations are all much larger than in the example below.

My initial thought is to break the full data frame into a list by `subject` using `split()`. From there, use `sample(unique(df1$subject), n, replace = TRUE)` to draw the new list of subjects, then perhaps use `quickdf()` from the `plyr` package to construct the new data frame?
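
A minimal sketch of that split-and-sample idea, assuming the `Grunfeld` data from `plm` with `firm` as the subject column. The names `by_firm`, `one_draw`, and `boot_coefs` are illustrative only and do not come from any package, and base `rbind()` stands in for `quickdf()` here:

```r
library(plm)  # used here only for the Grunfeld example data
data("Grunfeld", package = "plm")

# Split once, outside the bootstrap loop: one data frame per firm.
by_firm <- split(Grunfeld, Grunfeld$firm)

# One bootstrap draw: sample firm names with replacement, stack their blocks,
# refit the regression, and return the coefficient vector.
one_draw <- function(blocks, n) {
  picked <- sample(names(blocks), n, replace = TRUE)
  newdata <- do.call(rbind, blocks[picked])  # a firm drawn twice appears twice
  coef(lm(value ~ inv + capital, data = newdata))
}

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
boot_coefs <- t(replicate(100, one_draw(by_firm, n = 10)))
head(boot_coefs)
```

Splitting up front means each iteration only indexes a list and calls `rbind()` once, rather than scanning the full data frame with `subset()` for every sampled firm.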

Any thoughts are appreciated!

Example slow code:

``````
library(plm)  # provides the Grunfeld panel data set
data("Grunfeld", package = "plm")

firms = unique(Grunfeld$firm)
n = 10
iterations = 100
mybootresults = vector("list", iterations)

for (j in 1:iterations) {

  # draw n firm IDs with replacement (sample the IDs themselves, not positions)
  v = sample(firms, n, replace = TRUE)
  newdata = NULL

  # stack every observation of each sampled firm; this repeated rbind() is the slow part
  for (i in 1:n) {
    newdata = rbind(newdata, subset(Grunfeld, firm == v[i]))
  }

  reg1 = lm(value ~ inv + capital, data = newdata)
  mybootresults[[j]] = coefficients(reg1)

}

# one row per iteration, one column per coefficient
mybootresults = as.data.frame(t(matrix(unlist(mybootresults), ncol = iterations)))
names(mybootresults) = names(reg1$coefficients)
mybootresults

# Sample output (values vary from run to run):
#   (Intercept)      inv    capital
# 1    373.8591 6.981309 -0.9801547
# 2    370.6743 6.633642 -1.4526338
# 3    528.8436 6.960226 -1.1597901
# 4    331.6979 6.239426 -1.0349230
# 5    507.7339 8.924227 -2.8661479
# ...
# ...
``````

``````myfit <- function(x, i) {