skylobo skylobo - 1 month ago 12
R Question

R: NA substitution between data frame based on index

I have this data

df.1
:

month a b c
1 0 0.000000000 0.000000000
2 0 0.000000000 0.001503194
3 0 0.000000000 0.000000000
4 0 0.000000000 0.000000000
5 0 0.000000000 0.000000000
6 0 0.000000000 0.000000000
7 0 0.000000000 0.000000000
8 0 0.000000000 0.000000000
9 0 0.000000000 0.000000000
10 0 0.000000000 0.000000000
11 NA NA NA
12 NA NA NA
1 0 0.000000000 0.000000000
2 0 0.001537279 0.006917756
3 0 0.000000000 0.003669725
4 0 0.000000000 0.000000000
5 0 0.000000000 0.000000000
6 0 0.000000000 0.000000000
7 0 0.000000000 0.000000000
8 0 0.000000000 0.000000000
9 0 0.000000000 0.000000000
10 0 0.000000000 0.000000000
11 0 0.000000000 0.013513514
12 NA NA NA


and this data
df.2
:

month a b c
1 0.03842077 0.002266291 0.000000000
2 0.01359501 0.001027937 0.000000000
3 0.08631519 0.008732519 0.001376147
4 0.26564710 0.083635347 0.019053692
5 0.34839088 0.152203121 0.021010075
6 0.31767367 0.152029019 0.029397773
7 0.31507761 0.110973916 0.023445471
8 0.29773872 0.096458381 0.026745770
9 0.31226976 0.109342562 0.023996392
10 0.23841220 0.081582743 0.021674228
11 0.04379016 0.003519300 0.000000000
12 0.02244389 0.002493766 0.000000000


I would to subsitute the value NA (and only NA) in
df.1
[,2:4] with value in
df.2
[,2:4] when the index in column 1 (
month
) is the same. I tried with this code:

res_new <- data.frame(matrix(nrow=nrow(df.1),ncol=3))
for (n in 1:12){
res_new <- data.frame(ifelse(is.na(df.1[which(df.1[,1] == n),2:4])==TRUE,df.2[which(df.2[,1] == n),2:4],df.1[,n]))

}


but the result is a big new matrix where each NA value in
df.1
is substitued with all value in
df.2


How can do it?
(Clearly my true data frame are more big!!)

lmo lmo
Answer Source

Assuming that you have complete rows that have missing values that you want to fill in, you can do this is two steps using which and match.

# find the location of the missing rows in df
missRows <- which(!complete.cases(df.1))
# fill in missing rows with rows in df.2 with matching months
df.1[missRows, ] <- df.2[match(df.1$month[missRows], df.2$month, nomatch=0),]

Note that missing rows are identified with !complete.cases. Also, the nomatch=0 argument is used in order to ignore instances where no match is found.