Jesster - 5 months ago
R Question

Count people present within specified date range

I have one data frame containing individuals' arrival and departure dates and their total length of stay (los):

arrive <- as.Date(c("2016/08/01","2016/08/03","2016/08/03","2016/08/04"))
depart <- as.Date(c("2016/08/02","2016/08/07","2016/08/04", "2016/08/06"))
people <- data.frame(arrive, depart)
people$los <- people$depart - people$arrive

...and another df containing start & end dates.

start <- seq(from = as.Date("2016/08/01"), to = as.Date("2016/08/08"), by = "days")
end <- seq(from = as.Date("2016/08/01"), to = as.Date("2016/08/08"), by = "days")
range <- data.frame(start, end)

How can I add a column range$census to count how many people were present each day? For my example, the values I'm looking for would be as follows:

range$census <- c(1,1,2,3,2,2,1,0)

What I am not sure of is how to apply a calculation on values from one df to another df of a different length. Here's what I've tried so far:

people$count <- 1
range$census <- sum(people$count[people$arrive <= range$start & people$depart >= range$end])

Note: in the example above the start and end dates are the same day, but I will also need to look at larger ranges, where the start and end dates are a month or a year apart.


Why do you need the 'end' column in range?

This will work:

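# Start every day at zero, then add 1 for each person present on that day.
range$count <- rep(0, nrow(range))
for (i in seq_len(nrow(people))) {
  # All days of this person's stay, arrival and departure days included.
  stay <- seq(people$arrive[i], people$depart[i], by = "day")
  range$count <- range$count + range$start %in% stay
}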
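
If the ranges span more than one day (a month or a year, as mentioned in the question), one possible variation is to use both the start and end columns and test for interval overlap: a person counts as present if they arrived on or before the range's end and departed on or after its start. This is only a sketch. It assumes the departure day counts as present (which matches the expected values in the question), and the loop index i is just an illustrative name.

```r
# Example data from the question.
people <- data.frame(
  arrive = as.Date(c("2016/08/01", "2016/08/03", "2016/08/03", "2016/08/04")),
  depart = as.Date(c("2016/08/02", "2016/08/07", "2016/08/04", "2016/08/06"))
)
days <- seq(as.Date("2016/08/01"), as.Date("2016/08/08"), by = "day")
range <- data.frame(start = days, end = days)

# For each row of range, count the stays that overlap [start, end], inclusive.
range$census <- vapply(seq_len(nrow(range)), function(i) {
  sum(people$arrive <= range$end[i] & people$depart >= range$start[i])
}, integer(1))

range$census
```

The same comparison works for a single long range, for example sum(people$arrive <= as.Date("2016/08/31") & people$depart >= as.Date("2016/08/01")) counts everyone present at any point in August.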