Rui Rui - 1 month ago 7
R Question

Match values in 2 dataframes, NA error

I have to include the full data here, because the problem depends on it.
I would like to match values from 2 dataframes. However, some values are not matched, and I cannot see why.
I will try to explain my problem concisely.

1) dataframe with the values to be matched



#1.1) I have the following vector
Pos<-c(8.75, 9.3, 8.8, 9.6, 9.4, 11, NA, 13, 10.5, 12.31, 11.18, 13.06, 10.71, 12.5, 15.03, 15.26, 13.22, 15.25, 13.03, 15.28, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 9.2, NA, 9.6, NA, 10.93, NA, 11.19, NA, 10.86, 10.3, 9.4, 9.1, 9.1, 9.4, 9.7, 8.9, 9.86, 9.2, 9.2, NA, NA, NA, NA, NA, NA, NA, 10.9, NA, NA, 10.92, 10.69, 9.91, 10.01, NA, 10.66, NA, 10.38, NA, 11.4, 7.4, 7.3, 9, 9.6, NA, NA, 8, 9.3, NA, NA, 9.33, 9.9, 9.9, 11.2, 6.9, 7.3, 7, 8.7, 7.4, 8.6, 7.6, 9.24, 8.59, 8.6, 8.46, NA, 8.21, 9, 6.6, 8.5, 8.5, 10.2, 9.6, 9.55, NA, NA, 7.8, 9.6, NA, NA, 10.5, 11.4, 11.81, 9.7, NA, NA, 7.8, 8.9, NA, NA, NA, 12.29, NA, 11, NA, NA, NA, 11.11, NA, NA, 8.1, 8.1, 8.3, 10.2, NA, NA, 8.2, 11, NA, NA, NA, 8.7, NA, 8.9, NA, 11.3, NA, 12.2, NA, 12.5, 10.76, 14, 11.19, 15.4, NA, NA, 8.9, 10.9, NA, NA, 9.04, 9.74, 9.41, 9.43, 10.96, 10.93, 13.06, 10.31, 11.69, 8.66, 9.11, 8.87, 9.61, 8.99, 9.48, 9.58, 9.26, 9.29, 8.4, 8.5, 8.2, 8.3, 12.1, 8.7, 13.9, 8.8, 7.79, 10.45, 9.56, 9.66, 10.55, 11.76, 9.31, 12.36, 9.33, 10.71, 13.03, 12.36, 11.88, 11.94, 12.83, 13.51, 12.54, 14.29, 11.43, 11.19, 11.4, 9.9, 13.21, 11.1, 12.75, 12.03, 11.55, 10.3, 10.26, 10.31, 8.9, 8.8, 9.12, 10.35, 9.2, 9.3, 8.9, 7.7, 8.51, 8.2, 8.2, 8.54, 8.6, NA, 8, 8.5, 8.84, 8.22, 9.78, 7.8, 7.5, 7.7, 7.7, 9.68, 8.1, 8.21, 7.91, 8.11, 9.21, 9.01, 9.89, 8.2, 8.56, 10.19, 9.1, 9, 10.46, 8.7, 10.16, 8.9, 8.7, 9.6, 7.76, 7.76, 8.51, 10.26, 7.2, 11.71, 11.43, 11.24, 7.3, 9.13, 8.74, 8.81, 8.61, 8.63, 9.43, 8.93, 9.13, 9.33, 7.47, 7.21, 7.71, 8.28, 7.48, NA, 7.44, 8.81, 7.42, 7.25, 6.1, 8.74, 8.51, 6.7, 8.76, 6.2, 7.94, 8.51, 6.8, 13.03, 13.09, 12.9, 13.34, 13.07, 12.02, 12.94, 12, 12.61, 9.96, 8.79, 8.91, 9.2, 8.73, 8.61, 7.89, 8.17, 11.71, 8.99, 11.35, 10.36, 9.67, 8.86, 10.2, 11.17, 12.75, 12.49, 7.6, 9.62, 8.1, 9.93, 12.4, NA, NA, 8.3, 9.95, 7.4, 9.21, 9.34, 10.09, 7.9, 9.64, 7.6, 10.19, 12.65, 10.3, 10.3, 11, 11.66, 16, 11, 12.7, 11, 11.4, 11.49, 12.79, 16.65, NA, 11.75, 12.94, 13.3, 11.3, 9.86, 10.9, 12.08, 11, 
9.99, 12.81, 12.36, NA, NA, 7.66, 6.5, 6.3, 6.4, 7, 7.1, 8.48, 6.8, 7.75, 12.97, 12.88, 12.49, 12.59, 12.83, 11.59, 8.9, 13.93, 13.35, 13.63, 14.64, 13.53, 13.64, 13.68, 13.38, 13.97, 12.98, 12.35, 12.89, 9.54, 9.3, 10.16, 10.71, 11.95, 12.03, 9.26, 10.15, 10.26, 6.7, 6.6, 7, 6.3, 7.76, 8.21, 7.7, 7.6, 13.49, 12.2, NA, 12.76, 12.78, 12.5, 13.57, 12.3, 12.84, 15.85, 11.26, 9.4, 11.16, 10.69, 11.43, 10.17, 10.51, 13.27, 11.39, 10.9, 10.54, NA, 10, 11.64, 10.6, 10.1, NA, 11.29, 7.61, 7.3, 7, 9.3, 13.33, 8.01, 8.16, 7.1, 9.91, 8.08, 11.33, 7.4, 10.39, 9, 11.5, 10.68, 8.53, 9.3, 11.19, 15.62, 11.02, 10.3, 9.7, 11.3, 10.5, 10.84, 13.86, 7.9, 7.6, 9.46, 7.9, 7.8, 9.33, 9.79, 7.7, 8.5, 8.3, 8.2, 8.1, 8.1, 10.2, 7.9, 8.3, 9.56, 9.34, 8.6, 9.6, 9.27, 8.1, 11.8, 9.74, 8.9, 8.3, 9.7, 7.6, 7.2, 9.21, 7.8, 7, 7.1, 8.1, 8.85, 9.4, 9.91, 9.44, 10.06, 8.6, 10.2, 10.55, NA, NA, 12.79, NA, NA, 9.75, 13.11, 14.54, NA, 14.36, 10.18, 14, 12.1, 15.26, NA, 10.99, 9.59, 10.9, 10.81, 9.3, 8.2, 8.75, 9.6, 8.9, 11.11, 11, 12, 10.9, 10.96, 8.99, 12.1, 11.76, 12.83, 11.1, 9.12, 8.54, 7.5, 9.01, 10.16, 11.71, 9.43, NA, 8.76, 13.07, 8.73, 8.86, 12.4, 7.9, 16, 11.75, 12.81, 7.1, 11.59, 13.38, 11.95, 7.76, 12.5, 11.43, 11.64, 13.33, 9, 9.7, 7.8, 10.2, 11.8, 7, 10.2, 14.54)
#1.2) Height is the column to be filled
Pos.table<-data.frame(Pos=Pos,Height=NA)


2) dataframe with theoretical values



#2.1) the whole range of values that "Pos" can get
Source<- seq(0,17,0.01) # possible values that Pos can take, in [0,17]
#2.2) Height.0 is the value of Height computed in the loop below
Table.match<- data.frame(Source=Source,Height.0=NA)
# loop for Source (real values)
for (i in 1:dim(Table.match)[1])
{
Table.match[i,"Height.0"] <- -57.5+5*(Table.match[i,"Source"])
}


3) Problem



The following Loop looks for respective matches.

for (i in 1:dim(Pos.table)[1])
{
H.i<-match(Pos.table[i,"Pos"], Table.match[,"Source"], nomatch = 0)
Pos.table[i,"Height"] <-ifelse(H.i,Table.match[H.i,"Height.0"],0)
# Rev.table[i,"Rev.Prot"]<-Rev.table[i,"Rev.Prot"]*Rev.table[i,"Yield"]
}


However, some values are not matched. For example, positions 15 and 20 (among many others):

# both return NAs
match(15.03, Table.match[,"Source"])
match(15.28, Table.match[,"Source"])


Could you please advise me on how to overcome this problem?

Answer

I agree with Nicole that exact comparison between floating-point numbers should be avoided.
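As a small illustration of why the exact comparison fails, the sketch below rebuilds Source as in the question and looks at the element that nominally equals 15.03 (index 1504, since the sequence starts at 0 in steps of 0.01). The variable name x is only for this example.

```r
# Rebuild the lookup sequence exactly as in the question
Source <- seq(0, 17, 0.01)

# Element 1504 is nominally 15.03, but it is computed as 1503 * 0.01,
# which is not the same double as the literal 15.03
x <- Source[1504]

print(x == 15.03)                    # exact comparison
print(isTRUE(all.equal(x, 15.03)))   # comparison within a small tolerance
print(format(x, digits = 17))        # show the stored value in full
print(match(15.03, Source))          # match() relies on exact equality
```

match() uses exact equality, so a difference in the last binary digit is enough to produce NA, while all.equal() compares within a tolerance and treats the two values as equal.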

To solve that, I've just added a round() to 2 decimal places on both sides of the match in the code:

for (i in 1:dim(Pos.table)[1])
{
  H.i<-match(round(Pos.table[i,"Pos"],2), round(Table.match[,"Source"],2), nomatch = 0)
  Pos.table[i,"Height"] <-ifelse(H.i,Table.match[H.i,"Height.0"],0)
  # Rev.table[i,"Rev.Prot"]<-Rev.table[i,"Rev.Prot"]*Rev.table[i,"Yield"]
}

I think this solves the problem.
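If the loop turns out to be slow, the same idea can probably be written without it, because match() is vectorized. The following is only a sketch under some assumptions: it uses a short illustrative subset of Pos instead of the full vector, it matches on integer keys (the values multiplied by 100 and rounded) as a variation on round(x, 2), and the names key.pos, key.src and idx are made up for this example. As in the loop, unmatched or missing positions get 0.

```r
# Short illustrative subset of Pos (the full vector is in the question)
Pos <- c(8.75, 15.03, NA, 15.28)

# Lookup table, built as in the question but without the loop
Source   <- seq(0, 17, 0.01)
Height.0 <- -57.5 + 5 * Source

# Integer keys avoid comparing doubles directly (hypothetical helper names)
key.pos <- as.integer(round(Pos * 100))
key.src <- as.integer(round(Source * 100))

# One vectorized match for all rows; NA where nothing matches
idx <- match(key.pos, key.src)

# Keep the loop's convention: 0 for unmatched or missing positions
Height <- ifelse(is.na(idx), 0, Height.0[idx])

Pos.table <- data.frame(Pos = Pos, Height = Height)
print(Pos.table)
```

Note that Height.0 is a linear function of Source here, so for this particular table -57.5 + 5 * Pos would give the same heights without any lookup; the match is only needed if the table cannot be computed directly.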
