Akhil Nair - 1 month ago
R Question

data.table filter list column for empty values

Can I filter a data.table by a list column to get the rows with empty lists?

library(data.table)
dt = data.table(a = c(1, 2, 3), b = list(c("A", "B"), character(0), c("C", "D", "E")))

> dt
   a     b
1: 1   A,B
2: 2
3: 3 C,D,E


i.e. the expected result is

> dt[filter(b)]
a b
1: 2


The obvious filtering doesn't work

> dt[length(b) == 0]
Empty data.table (0 rows) of 2 cols: a,b

> dt[length(b[[1]]) == 0]
Empty data.table (0 rows) of 2 cols: a,b


I thought I might be able to define a function that returns the right boolean value. It only gives the right answer when I group by a, though, so it doesn't work in the filter argument:

is_null_list = function(l) is.list(l) & length(l[[1]]) == 0

> dt[, is_null_list(b), a]
a V1
1: 1 FALSE
2: 2 TRUE
3: 3 FALSE

> dt[is_null_list(b)]
Empty data.table (0 rows) of 2 cols: a,b


I guess the more general question is also: can any filtering be done on data.table list columns? I suspect the answer is no, as you can't key by a list, but thought it was worth asking.

Thanks

Answer

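You can filter by the length of each element of a list column with lengths(). Unlike length(b), which returns the length of the whole column (a single number), lengths(b) returns one length per row. For example,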

dt[ lengths(b) == 0L ]
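
On the more general question: a list column can be filtered by any expression that returns one logical value per row. The following is a minimal sketch, not the only way to do it. It reuses the dt from the question; the names empty_rows and has_c and the "contains C" predicate are made up for illustration.

```r
library(data.table)

dt = data.table(a = c(1, 2, 3),
                b = list(c("A", "B"), character(0), c("C", "D", "E")))

# length() sees the column as one list with three elements,
# so length(b) == 0 is a single FALSE and selects nothing.
length(dt$b)

# lengths() is applied element by element: one value per row.
lengths(dt$b)

# Rows whose list element is empty.
empty_rows = dt[lengths(b) == 0L]

# Any other per-element test can be vectorised with vapply(),
# here: rows whose list element contains "C" (illustrative predicate).
has_c = dt[vapply(b, function(x) "C" %in% x, logical(1))]
```

This also explains why is_null_list(b) only worked with the group by: within each group b is a one-element list, so l[[1]] is that row's value, whereas without grouping l[[1]] is always the first row's value.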