ErrantBard ErrantBard - 5 months ago 19
R Question

R data.table - simple way to join on all columns without specifying column names

I have two tables A1 and A2 that I would like to join on all columns (purpose is to have a function that can be used to track changes to the data). Is there a easy way to join on all columns without specifying them explicitly.

Something like

A1[A2, on="ALL"]

If anyone needs some code example I can provide it, but the question is more generally about if I have missed something in the help section for data.table.

I'm looking for a data.table solution. If there is none I'll use dplyr's:

inner_join(A1, A2, by = NULL)


I think that the way to do this in the data table is the following:

dt1 <- data.table(A1 = c(1,2,3), A2 = c("A", "B", "D"))
dt2 <- data.table(A1 = c(3,2,3), A2 = c("A", "B", "C"))

setkeyv(dt1, names(dt1))
setkeyv(dt2, names(dt2))

and the inner join on all common columns is:

dt1[dt2, nomatch = 0]

Other options include the following (credits to Frank in the comments):

dt1[dt2, on=names(dt2), nomatch = 0]

This has the benefit of not requiring to key the data table. (More info can be found here: What is the purpose of setting a key in data.table? )

Another option using the data sets operations (available in version 1.9.7 or later):

fintersect(dt1, dt2)