user6571411 - 1 year ago 90
R Question

# Find subset of individuals who have 3 or more observations over 6 months

I have a dataframe where each row is an observation of an event. There are two columns, `id` and `date`. I want to make a third column that identifies those individuals (based on `id`) who have 3 or more events over any 6 month period (based on `date`). However, an event can only count as unique if it is more than 7 days away from a previous event. Having a third column is not necessary if users can think of another way of accomplishing this.

``````
id <- c(1,1,1,2,2,2,3,3,3,4,4)
date <- as.Date(c("2015-01-01", "2015-03-02", "2015-03-05", "2015-01-13", "2015-01-29", "2015-12-15", "2015-01-03", "2015-03-03", "2015-04-03", "2015-01-29", "2015-03-04"), format = "%Y-%m-%d")
df <- data.frame(id, date)
``````

In the dummy code above, the method should identify individual `id == 3` as having the needed number of observations over the correct interval of time, while excluding `id == 1` because the observations at dates `"2015-03-02"` and `"2015-03-05"` are within 7 days of each other, and `id == 2` and `id == 4` because they have fewer than 3 observations over 6 months.

Maybe this helps:

``````
library(data.table)

# Flag an id as TRUE when it has at least 3 rows and every gap between
# consecutive dates is more than 7 and fewer than 60 days
# (assumes the dates are already sorted within each id).
setDT(df)[, ind := .N > 2 && all(diff(date) > 7) && all(diff(date) < 60), by = id][]
#    id       date   ind
# 1:  1 2015-01-01 FALSE
# 2:  1 2015-03-02 FALSE
# 3:  1 2015-03-05 FALSE
# 4:  2 2015-01-13 FALSE
# 5:  2 2015-01-29 FALSE
# 6:  2 2015-12-15 FALSE
# 7:  3 2015-01-03  TRUE
# 8:  3 2015-03-03  TRUE
# 9:  3 2015-04-03  TRUE
#10:  4 2015-01-29 FALSE
#11:  4 2015-03-04 FALSE
``````
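
The check above compares every consecutive gap against fixed bounds, so it may miss an individual who has a qualifying cluster of events plus some extra observations elsewhere. A more general variant could slide a window over each individual's dates instead. The following is only a base R sketch under stated assumptions: `has_cluster` is a hypothetical helper name, "6 months" is approximated as 183 days, and an event counts as unique when it falls more than 7 days after the last *counted* event.

```r
id <- c(1,1,1,2,2,2,3,3,3,4,4)
date <- as.Date(c("2015-01-01", "2015-03-02", "2015-03-05", "2015-01-13",
                  "2015-01-29", "2015-12-15", "2015-01-03", "2015-03-03",
                  "2015-04-03", "2015-01-29", "2015-03-04"))
df <- data.frame(id, date)

# Hypothetical helper: TRUE if some window of `window_days` days contains at
# least `min_events` events, each more than `gap_days` after the last counted one.
# window_days = 183 is an assumed stand-in for "6 months".
has_cluster <- function(dates, min_events = 3, gap_days = 7, window_days = 183) {
  days <- sort(unique(as.numeric(dates)))  # work in whole days since the epoch
  for (start in days) {
    # Events inside the window that begins at this event
    in_window <- days[days >= start & days <= start + window_days]
    last <- in_window[1]
    count <- 1
    # Greedily count events that are far enough from the last counted event
    for (d in in_window[-1]) {
      if (d - last > gap_days) {
        count <- count + 1
        last <- d
      }
    }
    if (count >= min_events) return(TRUE)
  }
  FALSE
}

# One flag per id, then mapped back onto every row of that id
flag <- vapply(split(df$date, df$id), has_cluster, logical(1))
df$ind <- unname(flag[as.character(df$id)])

unique(df$id[df$ind])
```

On the sample data this flags only `id == 3`, matching the result requested in the question. The thresholds are function arguments, so a different definition of the 6 month window or the 7 day gap can be tried without changing the logic.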