Joehax - 1 year ago 71
R Question

# Assigning a subset of a data.table to a subset of another data.table

I have two data.tables:

``````library(data.table)
dt1 <- data.table(A=1:10, B=1, C=2, D=3)
dt2 <- data.table(A2=5:9, B2=4, C2=5, D2=6)
``````

I want to assign elements of column B2 and C2 in dt2 to column B and C in dt1 where A = A2, such that I get a table that looks like this:

``````dt1
#      A B C D
#  1:  1 1 2 3
#  2:  2 1 2 3
#  3:  3 1 2 3
#  4:  4 1 2 3
#  5:  5 4 5 3
#  6:  6 4 5 3
#  7:  7 4 5 3
#  8:  8 4 5 3
#  9:  9 4 5 3
# 10: 10 1 2 3
``````

I know I can assign one column at a time:

``````id1 <- which(dt1\$A %in% dt2\$A2)
id2 <- which(dt2\$A2 %in% dt1\$A)
dt1\$B[id1] <- dt2\$B2[id2]
dt1\$C[id1] <- dt2\$C2[id2]
``````

however this seems like a lot of lines of code to do only very little, especially if I had many columns.

I was thinking something like

``````# Not working:
dt1[id1][,list(B,C)] <- dt2[id2][,list(B2,C2)]
``````

would work, but I get an error message.

Is there a smarter, nicer way to do it?

We can do a join `on` the 'A' with 'A2' and assign (`:=`) the values of columns of interest to 'C' and 'D'

``````library(data.table)#v1.9.7+
dt1[dt2,  c('C', 'D') := .(C2, D2),  on = .(A= A2)]
dt1
#     A B C D
# 1:  1 1 2 3
# 2:  2 1 2 3
# 3:  3 1 2 3
# 4:  4 1 2 3
# 5:  5 1 5 6
# 6:  6 1 5 6
# 7:  7 1 5 6
# 8:  8 1 5 6
# 9:  9 1 5 6
#10: 10 1 2 3
``````

If there are many columns that we need to change, instead of typing out the columns, we get the `values` using `mget` and assign it to the corresponding column names in 'dt1'

``````dt1[dt2, names(dt1)[3:4] :=  mget(names(dt2)[3:4]), on = .(A = A2)]
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download