Joshua Rosenberg - 1 month ago
R Question

Count observations by day of year using lubridate in R

I am trying to count the number of observations by the day of the year. Here are six observations:

six_obs <- data.frame(
  Date = c("2015-09-06 00:00:12 UTC", "2015-09-06 00:01:47 UTC", "2015-09-06 00:03:30 UTC",
           "2015-10-06 00:03:31 UTC", "2015-10-06 00:03:36 UTC", "2015-10-06 00:06:18 UTC"),
  Count = c(6, 4, 5, 4, 5, 7),
  stringsAsFactors = FALSE
)


In order to group them by day of the year, I can do something like the following:

library(dplyr)
library(lubridate)

six_obs %>%
  mutate(Date = ymd_hms(Date),
         day_of_year = yday(Date)) %>%
  group_by(day_of_year) %>%
  summarize(number_of_obs = n())


This works fine, but if I have many dates spanning multiple years, it will not work directly, because the lubridate function yday returns an integer between 1 and 366, so the same calendar day in different years ends up in the same group.
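
To make the collision concrete, here is a minimal sketch. The timestamps are hypothetical (they are not part of six_obs) and were chosen only so that the same calendar day appears in two different years:

```r
library(lubridate)

# Hypothetical timestamps: same calendar day, different years
dates <- ymd_hms(c("2014-09-06 00:00:12", "2015-09-06 00:00:12"))

# yday() drops the year, so both timestamps get the same day number
day_numbers <- yday(dates)
same_group <- day_numbers[1] == day_numbers[2]

# In a leap year the last day of the year is numbered 366
leap_last <- yday(ymd("2016-12-31"))
```

Grouping on day_numbers alone would therefore merge the 2014 and 2015 observations into a single row.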

Is there a way to group by the day of the year while keeping different years separate? One solution is to use the lubridate functions yday and year and then to paste yday and year together, but it seems like there might be a more elegant solution.

Answer

dplyr::count is equivalent to group_by(...) %>% summarise(n = n()), so you really only need

six_obs %>% count(day_of_year = date(Date))

## # A tibble: 2 × 2
##   day_of_year     n
##        <date> <int>
## 1  2015-09-06     3
## 2  2015-10-06     3

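where lubridate::date simply converts (or parses, if the Date column is character) to the Date class, and is mostly equivalent to as.Date. Because the resulting date includes the year, observations from different years stay in separate groups.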
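
If the numeric day of the year is still wanted in the output, one possible variation is to count on year and yday together instead of pasting them. This is only a sketch: multi_year is a hypothetical data frame made up for illustration, not the asker's data.

```r
library(dplyr)
library(lubridate)

# Hypothetical data spanning two years (not the asker's data)
multi_year <- data.frame(
  Date = c("2014-09-06 00:00:12 UTC", "2015-09-06 00:01:47 UTC",
           "2015-09-06 00:03:30 UTC"),
  stringsAsFactors = FALSE
)

# Count on two keys so that equal day numbers from different years stay apart
by_year_day <- multi_year %>%
  mutate(Date = ymd_hms(Date)) %>%
  count(year = year(Date), day_of_year = yday(Date))
```

This gives one row per (year, day_of_year) pair, at the cost of an extra column compared with counting on the date directly.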