gibbz00 gibbz00 - 3 months ago 10
R Question

Updating data.table by inserting new rows that are different from old rows

I have two data.table(dt1 & dt2). dt1 is past product data and dt2 is present product data. I want to create a third data.table that inserts new rows from dt2 into dt1 only when product characteristics(Level or Color) are different or Product itself is different.

library(data.table)
dt1 <- fread('
Product Level Color ReviewDate
A 0 Blue 9/7/2016
B 1 Red 9/7/2016
C 1 Purple 9/7/2016
D 2 Blue 9/7/2016
E 1 Green 9/7/2016
F 4 Yellow 9/7/2016
')
dt2 <- fread('
Product Level Color ReviewDate
A 1 Black 9/8/2016
B 1 Red 9/8/2016
C 5 White 9/8/2016
D 2 Blue 9/8/2016
E 1 Green 9/8/2016
F 4 Yellow 9/8/2016
G 3 Orange 9/8/2016
')


My final data.table(dt3) should have the following changes:A and C are both different in dt2 than dt1, thats why the new(different) rows from dt2 gets inserted into the final table alongside all rows from dt1. G is a totally new product that was not in dt1, thats why it makes it into the final table.

Product Level Color ReviewDate
A 0 Blue 9/7/2016
A 1 Black 9/8/2016
B 1 Red 9/7/2016
C 1 Purple 9/7/2016
C 5 White 9/8/2016
D 2 Blue 9/7/2016
E 1 Green 9/7/2016
F 4 Yellow 9/7/2016
G 3 Orange 9/8/2016


I have tried:

setkey(dt1, Product)
setkey(dt2, Product)
dt3<- dt1[dt2]
setkey(dt3,Product,ReviewDate)

Answer

You can stack and uniqify:

unique(rbind(dt1, dt2), by=c("Product", "Level", "Color"))