A.Trzcionkowska - 1 year ago
R Question

Frequency in period of time

Is it possible to count the frequency of events per ID over a period of time? Example data:

ID=c(1,1,1,1,2,2,2,3,3,3)
Dates <- c("2004-01-01", "2008-10-01", "2001-01-01", "2011-04-01",
"2013-05-01", "2014-08-01", "2009-03-01", "2001-12-01", "2003-04-01", "2011-05-01")
a <- data.frame(ID, Dates)


I would like to achieve something like this:

ID = c(1,2,3)
N = c(4, 3, 3)
Period = c("?", "?", "?")
Freq = c(2.5, 1.3, 3.3)
b <- data.frame(ID, z = N, a = Period, y = Freq)


I guess I first need to sort the dates in descending order and compute the length of the period, but I have no idea how to do that.

Answer Source

You can use max and min on dates as long as your Dates variable has been converted with as.Date, i.e. a$Dates <- as.Date(a$Dates). Subtracting, max(Dates) - min(Dates), gives the range in days for each ID. Dividing by 365 and rounding converts those days to (approximate) whole years.
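As a small sketch of the date arithmetic described above, here is the same calculation in base R for the ID 1 dates from the question. The variable names d, span_days and span_years are illustrative and not part of the original answer, and dividing by 365 ignores leap days.

```r
# Dates for ID 1 from the question, converted from character to Date
d <- as.Date(c("2004-01-01", "2008-10-01", "2001-01-01", "2011-04-01"))

# Subtracting two Date values gives a difftime measured in days
span_days <- as.numeric(max(d) - min(d), units = "days")

# Approximate conversion to years (assumes 365-day years)
span_years <- span_days / 365

span_days          # 3742
round(span_years)  # 10
```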

library(dplyr)

# Convert the character dates to Date so max/min and subtraction work
a$Dates <- as.Date(a$Dates)

a %>% 
  group_by(ID) %>% 
  summarise(N = n(), Period = as.integer(round((max(Dates)-min(Dates))/365)), Freq = Period/N)

# A tibble: 3 × 4
#     ID     N Period     Freq
#  <dbl> <int>  <int>    <dbl>
#1     1     4     10 2.500000
#2     2     3      5 1.666667
#3     3     3      9 3.000000

NOTE: The Freq values do not all agree with the expected output in the question, which could be down to rounding. Functions such as floor or ceiling (and, of course, round) can be used to adjust how the period is rounded.
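To see how the choice of rounding function changes the result, the following base R sketch computes the unrounded span in years per ID and applies floor, round and ceiling side by side. It is only an illustration under the same 365-day-year assumption. The names by_id, n, years and res are made up for this example.

```r
# Example data from the question, with Dates stored as Date
a <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  Dates = as.Date(c("2004-01-01", "2008-10-01", "2001-01-01", "2011-04-01",
                    "2013-05-01", "2014-08-01", "2009-03-01", "2001-12-01",
                    "2003-04-01", "2011-05-01"))
)

# Split the dates by ID, then count events and measure the span per group
by_id <- split(a$Dates, a$ID)
n     <- sapply(by_id, length)
years <- sapply(by_id, function(d) as.numeric(max(d) - min(d)) / 365)

# Compare the three rounding choices and the resulting Freq = Period / N
res <- data.frame(
  ID             = names(by_id),
  N              = n,
  Period_floor   = floor(years),
  Period_round   = round(years),
  Period_ceiling = ceiling(years),
  Freq_floor     = floor(years) / n,
  Freq_ceiling   = ceiling(years) / n
)
res
```

With these data, round and floor give the same periods (10, 5 and 9), while ceiling gives 11, 6 and 10, so for ID 3 the frequency becomes 10/3, or about 3.33.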