reuss reuss - 2 months ago 14
R Question

R check row for row if a combination exists in another dataframe

I have trying find an answer to this issue with out luck.
I have two data frames (df1 and df2) in R with different information except for two columns, a person nummer (pnr) and a drug name (name). Row for row in df1, I want to check, if the combination of pnr and name exists somewhere in df2. If this combination exists, I want "yes" in a another column in df1. If not a "no".

df1
pnr|drug|...|check
---|----|---|-----
1 | 1 |...| no
1 | 2 |...| yes
2 | 2 |...| yes
3 | 2 |...| no
.....

df2
pnr|drug|...|
---|----|---|
1 | 2 |...|
2 | 2 |...|
....


For example, I want check, if the row combination pnr=1 & drug=1 exists in df2 (no), pnr=1 & drug=2 (yes) etc. And then place a "yes" or "no" in the check column in df1

I have tried the following
for
statement without luck. It does place a "yes or "no" in the "check" column, but it doesn't do it correctly

for(index in 1:nrow(df1)){
if((df1[index,]$pnr %in% df2$pnr)&(df1[index,]$name %in% df2$name)){
check_text="yes"}else{check_text="no"}
df1$check=check_text
}


I have a sense that I should be using
apply
, but I haven't been able to figure that out. Do any of you have an idea, how to solve this?`

Sincerely reussm

Answer

One way is using base R methods.

Pasting the columns pnr and drug together and finding a similar match in df1

df1$check <- ifelse(is.na(match(paste0(df1$pnr, df1$drug), 
                                        paste0(df2$pnr, df2$drug))),"No", "Yes")

#  pnr drug check
#1   1    1    No
#2   1    2   Yes
#3   2    2   Yes
#4   3    2    No
Comments