giacomoV - 7 months ago 76

R Question

I have an issue understanding how to use the `dplyr` `bootstrap` function. What I want is to generate a bootstrap distribution from two randomly assigned groups and compute the difference in their means, for example:

    library(dplyr)
    library(broom)
    data(mtcars)

    mtcars %>%
      mutate(treat = sample(c(0, 1), 32, replace = TRUE)) %>%
      group_by(treat) %>%
      summarise(m = mean(disp)) %>%
      summarise(m = m[treat == 1] - m[treat == 0])

The issue is that I need to repeat this operation `100` or `1000` times. Using `replicate`, I can do:

    frep = function(mtcars) mtcars %>%
      mutate(treat = sample(c(0, 1), 32, replace = TRUE)) %>%
      group_by(treat) %>%
      summarise(m = mean(disp)) %>%
      summarise(m = m[treat == 1] - m[treat == 0])

    replicate(1000, frep(mtcars = mtcars), simplify = TRUE) %>% unlist()

and get the distribution

I don't really get how to use `bootstrap` here. I tried:

    mtcars %>%
      bootstrap(10) %>%
      mutate(treat = sample(c(0, 1), 32, replace = TRUE))

    mtcars %>%
      bootstrap(10) %>%
      do(tidy(treat = sample(c(0, 1), 32, replace = TRUE)))

It's not really working. Where should I put the `bootstrap` step?

Thanks.

Answer

In the `do` step, we wrap with `data.frame` and create the 'treat' column; then we can group by 'replicate' and 'treat' to get the `summarise`d output column.

```
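library(dplyr)
library(broom)  # provides bootstrap() in the broom version used here

mtcars %>%
  bootstrap(10) %>%
  # add a random 0/1 'treat' column within each bootstrap replicate
  do(data.frame(., treat = sample(c(0, 1), 32, replace = TRUE))) %>%
  group_by(replicate, treat) %>%
  summarise(m = mean(disp)) %>%
  # one row per replicate: difference between the two group means
  summarise(m = m[treat == 1] - m[treat == 0])
# or, as 0 occurs first and 1 second, we can also use
# summarise(m = last(m) - first(m))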
```
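
If `bootstrap()` is not available in your installed version of broom (an assumption about your setup), the same idea can be sketched in base R. This is only an illustration, not the approach from the answer above: the helper name `boot_diff` and the seed are made up for this example. Each call resamples the rows with replacement, randomly assigns 'treat', and returns the difference in mean `disp`.

```r
# Hypothetical helper (not from the original post): one bootstrap replicate.
boot_diff <- function(data) {
  n <- nrow(data)
  # Resample rows with replacement: this is the bootstrap step
  resampled <- data[sample.int(n, n, replace = TRUE), , drop = FALSE]
  # Randomly assign each resampled row to group 0 or 1
  treat <- sample(c(0, 1), n, replace = TRUE)
  # Difference in mean 'disp' between the two groups
  mean(resampled$disp[treat == 1]) - mean(resampled$disp[treat == 0])
}

set.seed(1)  # arbitrary seed, only so the run is reproducible
diffs <- replicate(1000, boot_diff(mtcars))

length(diffs)                     # one difference per replicate
quantile(diffs, c(0.025, 0.975))  # rough 95% interval of the distribution
```

`hist(diffs)` would then plot the resulting distribution.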