jebyrnes - 1 year ago 88
R Question

# Simulating a timeseries in dplyr instead of using a for loop

So, while `lag` and `lead` in dplyr are great, I want to simulate a timeseries of something like population growth. My old school code would look something like:

```
tdf <- data.frame(time = 1:5, pop = 50)
for (i in 2:5) {
  tdf$pop[i] <- 1.1 * tdf$pop[i - 1]
}
```

which produces

```
  time    pop
1    1 50.000
2    2 55.000
3    3 60.500
4    4 66.550
5    5 73.205
```

I feel like there has to be a `dplyr` or `tidyverse` way to do this (as much as I love my for loop).

But, something like

```
library(dplyr)

tdf <- data.frame(time = 1:5, pop = 50) %>%
  mutate(pop = 1.1 * lag(pop))
```

which would have been my first guess just produces

```
  time pop
1    1  NA
2    2  55
3    3  55
4    4  55
5    5  55
```

I feel like I'm missing something obvious. What is it?

Note: this is a trivial example. My real examples use multiple parameters, many of which are time-varying (I'm simulating forecasts under different GCM scenarios), so the tidyverse is proving to be a powerful tool in bringing my simulations together.

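`lag` is vectorized: it shifts the original `pop` column once, so each row only sees the initial value of 50 and never the value computed for the previous row. `Reduce` (or its purrr variants, if you like) is what you want for cumulative functions that don't already have a `cum*` version written: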

```
library(dplyr)

# x is the running value; y (the next element of pop) is ignored
data.frame(time = 1:5, pop = 50) %>%
  mutate(pop = Reduce(function(x, y){x * 1.1}, pop, accumulate = TRUE))

##   time    pop
## 1    1 50.000
## 2    2 55.000
## 3    3 60.500
## 4    4 66.550
## 5    5 73.205
```

or with purrr,

```
library(dplyr)
library(purrr)

# .x is the running value carried forward from the previous step
data.frame(time = 1:5, pop = 50) %>%
  mutate(pop = accumulate(pop, ~.x * 1.1))

##   time    pop
## 1    1 50.000
## 2    2 55.000
## 3    3 60.500
## 4    4 66.550
## 5    5 73.205
```
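Since the question mentions time-varying parameters, here is a minimal sketch of how the same idea might extend to a growth rate that changes at each step. The `rate` column and its values are made up for illustration; the first entry is `NA` because no growth is applied before the initial time point, and the starting population of 50 is passed through `.init`.

```r
library(dplyr)
library(purrr)

# Hypothetical per-step growth multipliers; rate[i] is applied when moving
# from time i - 1 to time i, so the first entry is unused (NA).
tdf <- data.frame(time = 1:5, rate = c(NA, 1.10, 1.05, 0.98, 1.20)) %>%
  # .x is the running population, .y is the next rate in the sequence.
  # With .init the result has one more element than rate[-1], i.e. 5 values.
  mutate(pop = accumulate(rate[-1], ~ .x * .y, .init = 50))

tdf
```

Under these assumed rates, `pop` comes out to approximately 50, 55, 57.75, 56.595, 67.914. The same pattern should carry over to base R by using `Reduce` with its `init` argument and `accumulate = TRUE`.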