A.Trzcionkowska - 3 months ago
R Question

Frequency in period of time

Is it possible to count the frequency of events over a period of time? Example data:

ID <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3)
Dates <- c("2004-01-01", "2008-10-01", "2001-01-01", "2011-04-01",
"2013-05-01", "2014-08-01", "2009-03-01", "2001-12-01", "2003-04-01", "2011-05-01")
a <- data.frame(ID, Dates)


I would like to achieve something like this:

ID = c(1,2,3)
N = c(4, 3, 3)
Period = c("?", "?", "?")
Freq = c(2.5, 1.3, 3.3)
b <- data.frame(ID, z = N, a = Period, y = Freq)


I guess I first need to sort the dates in descending order and then compute the period of time they cover, but I have no idea how to do that.

Answer

You can use max and min on dates as long as you make sure that your Dates variable is converted with as.Date, i.e. a$Dates <- as.Date(a$Dates). Subtracting min(Dates) from max(Dates) gives the range in days. Dividing by 365 and rounding converts those days to (approximate) whole years.
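
Before the grouped version, the date arithmetic for a single ID can be sketched in base R. This is only an illustration using the four dates of ID 1 from the question; the names d, span, span_days and span_years are made up for this example.

```r
# The four dates belonging to ID 1 in the question's data
Dates <- c("2004-01-01", "2008-10-01", "2001-01-01", "2011-04-01")

# Character strings must be converted before date arithmetic works
d <- as.Date(Dates)

# Subtracting two Date values yields a difftime object
span <- max(d) - min(d)

# Pull out the span as a plain number of days, then approximate years
# (365-day years are an approximation that ignores leap days)
span_days  <- as.numeric(span, units = "days")
span_years <- span_days / 365

span_days          # 3742
round(span_years)  # 10
```

Using as.numeric(span, units = "days") makes the unit explicit instead of relying on whatever unit the difftime happens to print in.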

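library(dplyr)
a$Dates <- as.Date(a$Dates)  # convert so that max/min subtraction works
a %>% 
  group_by(ID) %>% 
  summarise(N = n(),
            Period = as.integer(round((max(Dates) - min(Dates)) / 365)),  # span in years
            Freq = Period / N)  # years per observation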

# A tibble: 3 × 4
#     ID     N Period     Freq
#  <dbl> <int>  <int>    <dbl>
#1     1     4     10 2.500000
#2     2     3      5 1.666667
#3     3     3      9 3.000000

NOTE: The Freq values for IDs 2 and 3 don't match the ones in the question, which may come down to how the period is rounded. floor or ceiling can be used in place of round to adjust the rounding.
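
To see how much the choice of rounding function matters, here is a rough base R sketch that computes Freq three ways on the question's data. It assumes the same 365-day year as above; the helper names years and n and the column names Freq_floor, Freq_round and Freq_ceiling are made up for this comparison.

```r
ID <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3)
Dates <- as.Date(c("2004-01-01", "2008-10-01", "2001-01-01", "2011-04-01",
                   "2013-05-01", "2014-08-01", "2009-03-01", "2001-12-01",
                   "2003-04-01", "2011-05-01"))

# Split the dates by ID (split keeps the Date class)
by_id <- split(Dates, ID)

# Unrounded span in years per ID
years <- sapply(by_id, function(d) as.numeric(max(d) - min(d), units = "days") / 365)

# Number of observations per ID
n <- lengths(by_id)

# Freq under each rounding rule, side by side
data.frame(ID = names(years),
           N = n,
           Freq_floor   = floor(years) / n,
           Freq_round   = round(years) / n,
           Freq_ceiling = ceiling(years) / n,
           row.names = NULL)
```

On this data floor and round give the same periods (10, 5, 9), while ceiling gives 11, 6 and 10, so only the ceiling version reproduces the 3.3 expected for ID 3. None of the three yields 1.3 for ID 2.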