Balaji Suresh - 3 months ago
R Question

Extracting Uncommon values from 2 data frames in R

Given two data frames containing dates:

d1
# dates
# 2016-08-01
# 2016-08-02
# 2016-08-03
# 2016-08-04

d2
# dates
# 2016-08-02
# 2016-08-03
# 2016-08-04
# 2016-08-05
# 2016-08-06


How do I create a third data frame containing only the values that are not common to both?

d3
# dates
# 2016-08-01
# 2016-08-05
# 2016-08-06


Data:

df1 <- structure(list(dates = structure(c(17014, 17015, 17016, 17017 ),
class = "Date")), .Names = "dates", row.names = c(NA, -4L), class =
"data.frame")

df2 <- structure(list(dates = structure(c(17015, 17016, 17017, 17018,
17019), class = "Date")), .Names = "dates", row.names = c(NA, -5L), class
= "data.frame")

Answer

Suppose you have two vectors x and y. The elements that are not shared between them are

c(x[!(x %in% y)], y[!(y %in% x)])

If you are working with data frames, and your dates column is of class "character" or "Date" rather than "factor", you can do

rbind(subset(df1, !(df1$dates %in% df2$dates)),
      subset(df2, !(df2$dates %in% df1$dates)))
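
If the dates column was read in as a factor, one option is to convert it before comparing. The following is a minimal sketch, assuming the factor labels are ISO-formatted date strings; f1 and f2 are hypothetical data frames built only for illustration.

```r
# Hypothetical data frames whose dates column is a factor
f1 <- data.frame(dates = factor(c("2016-08-01", "2016-08-02", "2016-08-03")))
f2 <- data.frame(dates = factor(c("2016-08-02", "2016-08-03", "2016-08-04")))

# as.character() yields the factor labels, which as.Date() then parses
f1$dates <- as.Date(as.character(f1$dates))
f2$dates <- as.Date(as.character(f2$dates))

# Same idiom as above, now applied to "Date" columns
res <- rbind(subset(f1, !(f1$dates %in% f2$dates)),
             subset(f2, !(f2$dates %in% f1$dates)))
res
```

After the conversion both columns share the same class, so the combined result keeps a "Date" column.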

Simple example with integer vectors

x <- 1:5
y <- 3:8
c(x[!(x %in% y)], y[!(y %in% x)])
# [1] 1 2 6 7 8
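
For plain atomic vectors like these, the same result can probably also be written with base R's set functions. This is only an alternative sketch; sym_diff is a name chosen for illustration. Note that setdiff() and union() drop duplicate values, and depending on the R version they may not preserve attributes such as the "Date" class, so the %in% form is the safer choice for dates.

```r
x <- 1:5
y <- 3:8

# Elements found only in x, followed by elements found only in y
sym_diff <- union(setdiff(x, y), setdiff(y, x))
sym_diff
```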

Vectors of class "Date"

x <- seq(from = as.Date("2016-01-01"), length.out = 5, by = 1)
y <- seq(from = as.Date("2016-01-03"), length.out = 5, by = 1)
c(x[!(x %in% y)], y[!(y %in% x)])
# [1] "2016-01-01" "2016-01-02" "2016-01-06" "2016-01-07"

Example with the data frames in your question

rbind(subset(df1, !(df1$dates %in% df2$dates)),
      subset(df2, !(df2$dates %in% df1$dates)))

#       dates
#1 2016-08-01
#4 2016-08-05
#5 2016-08-06
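
If you need this in several places, the idiom could be wrapped in a small function. The sketch below is only one way to do it: uncommon_rows and its col argument are hypothetical names, and it assumes both data frames have the same columns. It also resets the row names, which the rbind() call above carries over from the original data frames.

```r
# Hypothetical helper: rows of a and b whose `col` value does not occur in the other
uncommon_rows <- function(a, b, col) {
  only_a <- a[!(a[[col]] %in% b[[col]]), , drop = FALSE]
  only_b <- b[!(b[[col]] %in% a[[col]]), , drop = FALSE]
  out <- rbind(only_a, only_b)
  rownames(out) <- NULL  # renumber the rows 1..n
  out
}

# Same dates as in the question
df1 <- data.frame(dates = as.Date("2016-08-01") + 0:3)
df2 <- data.frame(dates = as.Date("2016-08-02") + 0:4)

d3 <- uncommon_rows(df1, df2, "dates")
d3
```

Using [[col]] rather than $dates keeps the column name a parameter, so the same function should work for any shared column.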