Woody Woody - 27 days ago 8
R Question

replace NA value in unmatched rows after joining data.table in R

I have two data.table.

dt1 <- data.table(id=seq(1, 4), col1=1:4, col2=1:4, col3=1:4)
dt2 <- data.table(id=seq(1, 5), val=c("a", "b", "c", "d", "e"))


I want to do a right join like

dt <- merge(dt1, dt2, all.y=TRUE, by="id")


but replace NAs in the results with 0.
The result should be

id col1 col2 col3 val
1 1 1 1 a
2 2 2 2 b
3 3 3 3 c
4 4 4 4 d
5 0 0 0 e


I have a lot of columns in the data.table so I don't want to change them one by one. Is there any efficient way to do this?

Answer Source

Instead of creating a new table, you might consider just adding the cols from dt1 to dt2:

cols = setdiff(names(dt1), "id")
dt2[dt1, on="id", (cols) := mget(paste0("i.", cols))]
dt2[!dt1, on="id", (cols) := 0 ]

   id val col1 col2 col3
1:  1   a    1    1    1
2:  2   b    2    2    2
3:  3   c    3    3    3
4:  4   d    4    4    4
5:  5   e    0    0    0

If you really want a new table, though, you can do

cols = setdiff(names(dt1), "id")
z = dt1[dt2, on="id"]
z[!dt1, on="id", (cols) := 0 ]

   id col1 col2 col3 val
1:  1    1    1    1   a
2:  2    2    2    2   b
3:  3    3    3    3   c
4:  4    4    4    4   d
5:  5    0    0    0   e

The package maintainers plan to write a vignette on joins.