JaimeCr - 1 year ago 115
R Question

# data.table reference column in another data.table by name

I have the following data.tables:

``````Comparison <- data.table(code = c("AAA", "BBB"),
elem1 = c(1, 2),
elem2 = c(4, 4))

DT <- data.table(A = c("AAA", "AAA", "AAA", "AAA"),
B = c("BBB", "BBB", "BBB", "BBB"),
C = c(1, 2, 3, 4))
``````

Now, I would want to add a new column based on a comparison of a column from
`Comparison`
and from
`DT`
. The following command generates the expected output:

``````DT[, newCol := {ifelse( abs(C - Comparison[code == "AAA", elem2]) == 0, "0", "1")}]

Output:

A   B C newCol
1: AAA BBB 1      1
2: AAA BBB 2      1
3: AAA BBB 3      1
4: AAA BBB 4      0
``````

However, if instead of hardcoding the column value of the column
`A`
, I use the column itself with this:

``````DT[, newCol := {ifelse( abs(C - Comparison[code == A, elem2]) > 0, "0", "1")}]
``````

It outputs the following error, which I'm not sure how to avoid:

``````Error in `[.data.table`(Comparison, code == A, elem2) :
RHS of == is length 4 which is not 1 or nrow (2). For robustness, no recycling is allowed (other than of length 1 RHS). Consider %in% instead.
``````

It seems to me that the operation is not vectorizing the elements of the column
`A`
of
`DT`
in
`Comparison`
and I don't really understand why, because the elements of the column
`C`
are taken properly (i.e. it uses the elements of
`C`
individually but not the elements of
`A`
). How could I perform this comparison?

Any help would be greatly appreciated.

We can use `join` with `on`
``````DT[Comparison, newCol := as.integer(C != elem2), on = c("A" = "code"), nomatch = 0]