Bonono Bonono - 1 month ago 7
R Question

Select non NA values in match when matching columns

I have two

data.frame
s,
aa
and
bb
:

aa <- data.frame(ID = c(rep(letters[1:4]),"d","d", "e","e"), Name = c("Dave", "Charlie", "Ella", NA,"Timothy","James",NA, "Bob"))
aa
ID Name
1 a Dave
2 b Charlie
3 c Ella
4 d <NA>
5 d Timothy
6 d James
7 e <NA>
8 e Bob

bb <- data.frame(ID = rep(letters[1:5]),Name = 0)

bb
ID Name
1 a 0
2 b 0
3 c 0
4 d 0


I want to fill
bb$Name
by matching
bb$ID
with
aa$ID
so that I get the earliest name in the alphabet (in this case
James
):

ID Name
1 a Dave
2 b Charlie
3 c Ella
4 d James
5 e Bob


However, if I try
bb$Name <- aa[match(bb$ID,aa$ID),2]
I get:

bb

ID Name
1 a Dave
2 b Charlie
3 c Ella
4 d <NA>
5 e <NA>


Do you think some if statement code along the lines of: if there are more than 1 of the same values in
aa$ID
, and if one of those values are
NA
, then select the one that is of earlier alphabetical order to match with. So the
d
value in bb$Name would be
James


I understand I could subset the original data but I don't want to do that in my actual data as the information that
aa$Name
has an
NA
value is important

Answer

Is it ok to create a temporary copy of aa?

tmp <- aa[!is.na(aa$Name),]
tmp <- tmp[order(tmp$Name),]
bb$Name <- tmp[match(bb$ID,tmp$ID),2]
bb

This gets you the result you wanted.