NBC - 1 month ago
R Question

How to merge dataframes when there are duplicates

I want to merge two datasets:

data_a

group | x | y
101 | 1 | test
101 | 1 | one
102 | 7 | two
102 | 3 | three


data_b

group | z
101 | 1
102 | 3


I want to merge data_a into data_b where data_b$group equals data_a$group and data_b$z equals data_a$x. However, sometimes more than one row in data_a matches the same row in data_b, so the merge returns duplicate rows. Instead, I'd like to merge only the first matching row from data_a, if possible:

data_b

group | z | y
101 | 1 | test
102 | 3 | three
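
For reference, here is a minimal sketch that reproduces the problem. It assumes the tables above are ordinary data frames with numeric group, x and z columns and a character y column; the name `merged` is only illustrative.

```r
# Rebuild the two example tables as data frames (column types are an assumption)
data_a <- data.frame(
  group = c(101, 101, 102, 102),
  x     = c(1, 1, 7, 3),
  y     = c("test", "one", "two", "three"),
  stringsAsFactors = FALSE
)
data_b <- data.frame(group = c(101, 102), z = c(1, 3))

# A plain merge on (group, z) = (group, x) keeps every matching row of data_a,
# so group 101 appears twice in the result
merged <- merge(data_b, data_a, by.x = c("group", "z"), by.y = c("group", "x"))
nrow(merged)
```

The merged result has three rows rather than the two I want, because both group 101 rows in data_a match.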

d.b
Answer

Using data from G. Grothendieck

# match() returns the position of the first match only, so the first matching row in data_a wins
data_b$y <- data_a$y[match(paste(data_b$group, data_b$z), paste(data_a$group, data_a$x))]
data_b
#  group z     y
#1   101 1  test
#2   102 3 three
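
If you would rather stay with merge(), one possible alternative (a sketch, not part of the approach above) is to drop the repeated keys from data_a first with duplicated() and then merge. The data frames are rebuilt here from the question's tables, with the column types assumed; `first_a` and `result` are illustrative names.

```r
# Example data rebuilt from the question (column types are an assumption)
data_a <- data.frame(
  group = c(101, 101, 102, 102),
  x     = c(1, 1, 7, 3),
  y     = c("test", "one", "two", "three"),
  stringsAsFactors = FALSE
)
data_b <- data.frame(group = c(101, 102), z = c(1, 3))

# Keep only the first row of data_a for each (group, x) combination
first_a <- data_a[!duplicated(data_a[c("group", "x")]), ]

# all.x = TRUE keeps rows of data_b that have no match (y becomes NA),
# which mirrors what match() does in that case
result <- merge(data_b, first_a,
                by.x = c("group", "z"), by.y = c("group", "x"),
                all.x = TRUE)
result
```

Note that merge() sorts the result by the key columns by default, while the match() version keeps data_b in its original row order.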