T-T T-T - 1 year ago 39
R Question

Order by month and year in R data frame

I have cleaned and ordered my data by date, which looks like below:

df1 <- data.frame(matrix(vector(),ncol=4, nrow = 3))
colnames(df1) <- c("Date","A","B","C")
df1[1,] <- c("2000-01-30","0","1","0")
df1[2,] <- c("2000-01-31","2","0","3")
df1[3,] <- c("2000-02-29","1","2","1")
df1[4,] <- c("2000-03-31","2","1","3")
df1
Date A B C
1 2000-01-30 0 1 0
2 2000-01-31 2 0 3
3 2000-02-29 1 2 1
4 2000-03-31 2 1 3


However, I want to drop the day and order the data by month and year so the data will look like:

Date A B C
1 2000-01 2 1 3
3 2000-02 1 2 1
4 2000-03 2 1 3


I tried to use
as.yearmon
from
zoo
df2 <- as.yearmon(df1$Date, "%b-%y")
and it returns
NA
. Thank you in advance for your generous help!

Answer Source

Here's a way to get the sum of the values for each column within each combination of Year-Month:

library(zoo)
library(dplyr)

# Convert non-date columns to numeric
df1[,-1] = lapply(df1[,-1], as.numeric)

df1 %>% mutate(Date = as.yearmon(Date)) %>%
  group_by(Date) %>%
  summarise_each(funs(sum))

Or, even shorter:

df1 %>% 
  group_by(Date=as.yearmon(Date)) %>%
  summarise_each(funs(sum))
           Date     A     B     C
1      Jan 2000     2     1     3
2      Feb 2000     1     2     1
3      Mar 2000     2     1     3

A couple of additional enhancements:

  1. Add the number of rows for each group:

    df1 %>% group_by(Date=as.yearmon(Date)) %>%
      summarise_each(funs(sum)) %>%
      bind_cols(df1 %>% count(d=as.yearmon(Date)) %>% select(-d))
    
  2. Multiple summary functions:

    df1 %>% group_by(Date=as.yearmon(Date)) %>%
      summarise_each(funs(sum(.), mean(.))) %>%
      bind_cols(df1 %>% count(d=as.yearmon(Date)) %>% select(-d))
    
           Date A_sum B_sum C_sum A_mean B_mean C_mean     n
1      Jan 2000     2     1     3      1    0.5    1.5     2
2      Feb 2000     1     2     1      1    2.0    1.0     1
3      Mar 2000     2     1     3      2    1.0    3.0     1