Kinmarui Kinmarui - 24 days ago 12
R Question

R padding time series with grouping

I have a data frame with the columns action, type, project_id, week, and events_in_time.

I want to run some analysis on each subgroup defined by action and type.
Currently I have a problem with padding the time series (column week): for some projects there is no entry for a given week.

How can I add rows with events_in_time set to 0 for the missing weeks in every project?

I tried merging as described here: https://bocoup.com/weblog/padding-time-series-with-r by generating all weeks and merging them with my data, but nothing happens. I understand that I would probably need to generate the weeks for every project, but I can't find out how to do it.
What I did:

all.week.frame=data.frame(week=seq(0,12)) # I know this only covers weeks 0 to 12
merged=merge(data, all.week.frame, all=TRUE)


Example data:
http://pastebin.com/eXbFPFLj

Save it to a file and load it with:

data= read.table("merged.csv", header = TRUE, sep = ",")

Answer

I think that complete from tidyr is what you are looking for. It takes a data.frame, followed by the columns whose combinations should all be present, then a list of the values to insert wherever a combination is missing. Here, it takes df, builds every combination of week and project_id, and fills your only other column (events_in_time) with 0 whenever there is no entry.

library(tidyr)
complete(df, week, project_id, fill = list(events_in_time = 0))
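
To make the behaviour concrete, here is a minimal sketch on made-up data. The toy values, the names padded, grid and padded_base, and the use of full_seq are all illustrative assumptions, not taken from the pastebin file. It pads each project to the same weeks with complete, then does the same thing in base R by building the full project/week grid before merging. That second half may also explain why merging on week alone appeared to do nothing: such a merge only adds weeks that are missing from the whole data set, not weeks missing for one project.

```r
library(tidyr)

# Hypothetical toy data: project 1 has no row for week 1,
# project 2 has no rows for weeks 0 and 3.
df <- data.frame(
  project_id     = c(1, 1, 1, 2, 2),
  week           = c(0, 2, 3, 1, 2),
  events_in_time = c(5, 3, 1, 4, 2)
)

# One row per (project_id, week) pair; pairs absent from df get events_in_time = 0.
# full_seq(week, 1) spans min(week)..max(week) in steps of 1, so it would also
# generate a week that no project reported at all.
padded <- complete(
  df,
  project_id,
  week = full_seq(week, 1),
  fill = list(events_in_time = 0)
)

# Base R equivalent: build the full grid first, then left-join the data onto it.
grid <- expand.grid(
  project_id = unique(df$project_id),
  week       = seq(min(df$week), max(df$week))
)
padded_base <- merge(grid, df, all.x = TRUE)  # joins on project_id and week
padded_base$events_in_time[is.na(padded_base$events_in_time)] <- 0
```

If the data frame also carries action and type, the new rows would get NA in those columns. One option, assuming each project_id belongs to a single action/type pair, is to replace project_id in the call with nesting(action, type, project_id), so that only the combinations already present in the data are padded.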