Marcus Nunes - 3 years ago
R Question

Fill a data frame with NA in R

Suppose I have a dataset like myData below:

set.seed(1234)

Date <- seq(as.Date("1990-01-01"), as.Date("1990-12-01"), "months")
Date <- rep(Date, 5)

Species <- rep(c("cat", "lion", "tiger", "leopard", "cheetah"), each=12)

Measurement <- rnorm(60)

index <- sample(1:60, 10)

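# Name the columns explicitly; otherwise R derives names such as "Date..index."
myData <- data.frame(Date = Date[-index],
                     Species = Species[-index],
                     Measurement = Measurement[-index])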


Notice that myData doesn't contain every possible combination of Date and Species: 10 rows are missing. I want to create a new data frame (say, myData2) that has all possible combinations of Date and Species, i.e., myData2 will have 60 rows. The value of Measurement in myData2 should be the original value if that combination of Date and Species is present in myData, and NA if the combination is missing.

I'm trying to accomplish this with two nested for loops, but it is not working. I know I'm making mistakes, but I can't figure out what they are.

Answer Source

You are looking for the complete() function from the tidyr package, which is designed for exactly this purpose:

tidyr::complete(myData, Date, Species)

# Source: local data frame [60 x 3]
# 
#          Date Species Measurement
#        (date)  (fctr)       (dbl)
# 1  1990-01-01     cat  -1.2070657
# 2  1990-01-01 cheetah  -0.5238281
# 3  1990-01-01 leopard  -2.1800396
# 4  1990-01-01    lion  -0.7762539
# 5  1990-01-01   tiger  -0.6937202
# 6  1990-02-01     cat   0.2774292
# 7  1990-02-01 cheetah  -0.4968500
# 8  1990-02-01 leopard  -1.3409932
# 9  1990-02-01    lion          NA
# 10 1990-02-01   tiger          NA
# ..        ...     ...         ...
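
If you would rather not depend on tidyr, the same result can probably be reached in base R by building every Date/Species pair with expand.grid() and left-joining the original data onto it with merge(). This is only a sketch, not part of the original answer: it assumes the columns are named Date, Species and Measurement, and allCombos is a name made up for the illustration.

```r
# Rebuild the example data with explicit column names (assumed names,
# matching the output shown above).
set.seed(1234)
Date <- rep(seq(as.Date("1990-01-01"), as.Date("1990-12-01"), "months"), 5)
Species <- rep(c("cat", "lion", "tiger", "leopard", "cheetah"), each = 12)
Measurement <- rnorm(60)
index <- sample(1:60, 10)
myData <- data.frame(Date = Date[-index],
                     Species = Species[-index],
                     Measurement = Measurement[-index])

# Every combination of the dates and species that occur in the data
# (allCombos is a hypothetical helper name).
allCombos <- expand.grid(Date = unique(myData$Date),
                         Species = unique(myData$Species),
                         stringsAsFactors = FALSE)

# Left join: keep every combination; pairs that are absent from myData
# get NA in Measurement.
myData2 <- merge(allCombos, myData, by = c("Date", "Species"), all.x = TRUE)

# Sort by date, then species, to mirror the layout of the tidyr output.
myData2 <- myData2[order(myData2$Date, myData2$Species), ]

nrow(myData2)                    # one row per Date/Species pair
sum(is.na(myData2$Measurement))  # number of combinations that were missing
```

Like complete(), this only expands over values that actually appear in myData. If a date or a species could be missing entirely, pass the full set of values to expand.grid() instead of unique().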