Chris Chris - 3 months ago 10
R Question

Shift a column of lists in data.table by group

I have a column of lists in R:

DT <- data.table(foo = c(list(c("a","b","c")), list(c("b","c")), list(c("a","b")), list(c("a"))), id = c(1,1,2,2))
DT
foo id
1: a,b,c 1
2: b,c 1
3: a,b 2
4: a 2


What I would like to do is replicate typical shift behavior to get:

foo id
1: b,c 1
2: NA 1
3: a 2
4: NA 2


For a normal column I would use shift, but this splits the lists into columns and shifts those (and flags a warning):

DT[ , shift(foo,1,type = "lead"), by = id]
id V1 V2
1: 1 b c
2: 1 c NA
3: 1 NA c
4: 2 b NA
5: 2 NA NA


If I wrap the shift call into a list, the return is a list but only the vector elements have been shifted:

DT[ , list(shift(foo,1,type = "lead")), by = id]
id V1
1: 1 b,c,NA
2: 1 c,NA
3: 2 b,NA
4: 2 NA

Answer

This has come up more than once. So I've gone ahead and added this feature. You'll have to use the development version at the moment though, v1.9.7.. see installation instructions here.

DT[, foo2 := shift(.(foo), type = "lead"), 
       by = id]
#      foo id foo2
# 1: a,b,c  1  b,c
# 2:   b,c  1   NA
# 3:   a,b  2    a
# 4:     a  2   NA

Just wrap foo for each group in a list. Note that it returns a list-of-list which works well with := as shown above.. If you're not adding/updating your data.table (which doesn't make much sense), then you'll have to extract the list element.

DT[, .(foo2 = shift(.(foo), type="lead")[[1L]]), 
        by = id]
#    id foo2
# 1:  1  b,c
# 2:  1   NA
# 3:  2    a
# 4:  2   NA

shift() is designed to play nicely with data.table's := syntax, since it returns the same number of rows all the time.

Comments