Udy Udy - 20 days ago 7
R Question

Cannot add dates of zero demand to zoo time series due to duplicate dates

My apologies in advance for any non-compliance with the rules of posting a question. The data table below is a sample of what I want to convert into a time series.

> Materials
MaterialID Date Quantity
1 2011-01-04 13
1 2011-01-04 5
2 2011-01-07 9
3 2011-01-09 3
3 2011-01-11 10


It consists of transaction entries for several Material items between 2011- 2014.The date range for the entire data set is from 4th Jan, 2011 - 31st Dec 2014. I want to create a transaction entry for each material within this period while accounting for the missing dates by setting the Quantity variable to zero for the missing dates. To put it another way the outcome I desire is that there will be an entry for each Material in the data set for every date between 4th Jan, 2011 - 31st Dec 2014 as shown below:

Date MaterialID_1 MaterialID_2 MaterialID_3
2011-01-04 13 0 0
2011-01-04 5 0 0
2011-01-05 0 0 0
2011-01-06 0 0 0
2011-01-07 0 9 0
2011-01-08 0 0 0
2011-01-09 0 0 3
2011-01-10 0 0 10
2011-01-11 0 0 0
. . . .
. . . .
. . . .
2014-12-31 0 0 0


I have tried some of the methods I have seen in the forum such as Add months of zero demand to zoo time series, but because I have duplicated dates I get the error, "index entries in ‘order.by’ are not unique". I'd appreciate any advise or help I can get with this.

After getting the data into this format, my intention is to reshape the data set to do batch forecasting. Thanks.

See dput code below:

dput(Data)
structure(list(MaterialID = c(1L, 1L, 2L, 3L, 1L), Date = c("2011-01-04",
"2011-01-04", "2011-01-07", "2011-01-09", "2011-01-11"), Quantity = c(13L,
5L, 9L, 3L, 10L)), .Names = c("MaterialID", "Date", "Quantity"
), class = "data.frame", row.names = c(NA, -5L))

Answer

You can do this with a split-apply-combine operation using xts objects. Unlike zoo, xts objects allow duplicate indices.

# sample data
Data <- read.csv(text = "MaterialID,Date,Quantity
1,2011-01-04,13
1,2011-01-04,5
1,2011-05-06,9
1,2011-08-07,3
1,2011-12-08,10
2,2011-03-09,4
3,2011-02-10,7
3,2011-10-11,78
3,2014-31-12,32", as.is = TRUE)
# split data into groups by material id
dataByMaterialId <- split(Data, Data$MaterialID)
# create an xts object for each id
xts_list <- lapply(dataByMaterialId, function(id) {
  names <- list(NULL, paste0("Qty.", id$MaterialID[1]))
  xts(id$Quantity, as.Date(id$Date, "%Y-%d-%m"), dimnames = names)
})
# use do.call + merge to combine all your xts objects into one object
xts_merged <- do.call(merge, c(xts_list, fill = 0)())
#            Qty.1 Qty.2 Qty.3
# 2011-04-01    13     0     0
# 2011-04-01     5     0     0
# 2011-06-05     9     0     0
# 2011-07-08     3     0     0
# 2011-08-12    10     0     0
# 2011-09-03     0     4     0
# 2011-10-02     0     0     7
# 2011-11-10     0     0    78
# 2014-12-31     0     0    32