bill ou bill ou - 20 days ago 7
R Question

How to format a pivot like table in R that includes records for all time and id values?

I'm collecting logs in R with 3 columns:

week, probe, and number of observations.

There aren't record when there is no observation.

week=c(1,2,2,4)
probe=c("A","C","B","C")
obs=c(2,4,3,1)
logs=data.frame(week,probe,obs)

logs

week probe obs
1 A 2
2 C 4
2 B 3
4 C 1


I want to reformat the data so that it includes all weeks and all probes even if there was no observation, so that it looks like this:

week probe obs
1 A 2
1 B 0
1 C 0
1 D 0
2 A 0
2 B 0
2 C 3
2 D 4
3 A 0
3 B 0
3 C 0
3 D 0
4 A 0
4 B 0
4 C 1
4 D 0


I have the list of all probes here:

allprobes=c("A","B","C","D")


and I want to look at these weeks:

allweeks=c(1:4)


I've been looking at melt, cast, reshape, but I only manage to get 1 line per id or month...
as I actually want to keep the original format of the logs.
It seems easy enough at first but I'm now stuck...
Any advice on how to get the data formatted this way?

Thanks a lot for any help.

Answer

Two options in base R:

Use expand.grid and merge:

> fullFrame <- expand.grid(allweeks, allprobes)
> names(fullFrame) <- c("week", "probe")
> merge(fullFrame, logs, all = TRUE)
   week probe obs
1     1     A   2
2     1     B  NA
3     1     C  NA
4     1     D  NA
5     2     A  NA
6     2     B   3
7     2     C   4
8     2     D  NA
9     3     A  NA
10    3     B  NA
11    3     C  NA
12    3     D  NA
13    4     A  NA
14    4     B  NA
15    4     C   1
16    4     D  NA

expand.grid will create a data.frame of all the possible combinations of your "allprobes" and "allweeks" objects. Then, we rename the columns of that new data.frame to match the relevant columns from your "logs" data.frame when using merge. The argument all = TRUE tells merge to fill in the missing values with NA.

If you want zeroes instead of NA, proceed as follows:

fullFrame <- expand.grid(allweeks, allprobes)
names(fullFrame) <- c("week", "probe")
finalLogs <- merge(fullFrame, logs, all = TRUE)
finalLogs[is.na(finalLogs)] <- 0

Use xtabs after converting "week" and "probe" to factors

If you convert "week" and "probe" to factors that includes all the relevant levels, then you can simply use xtabs wrapped in data.frame:

logs$week <- factor(logs$week, levels=c(1, 2, 3, 4))
logs$probe <- factor(logs$probe, levels=c("A", "B", "C", "D"))
data.frame(xtabs(obs ~ week + probe, logs))
#    week probe Freq
# 1     1     A    2
# 2     2     A    0
# 3     3     A    0
# 4     4     A    0
# 5     1     B    0
# 6     2     B    3
# 7     3     B    0
# 8     4     B    0
# 9     1     C    0
# 10    2     C    4
# 11    3     C    0
# 12    4     C    1
# 13    1     D    0
# 14    2     D    0
# 15    3     D    0
# 16    4     D    0
Comments