Jimmy Jimmy - 10 days ago 8
R Question

R expand.grid with row restrictions

I have a numeric vector x of length N and would like to create a vector of the within-set sums of all of the following sets: any possible combination of the x elements with at most M elements in each combination. I put together a slow iterative approach; what I am looking for here is a way without using any loops.

Consider the approach I have been taking, in the following example with N=5 and M=4

M <- 4
x <- 11:15
y <- as.matrix(expand.grid(rep(list(0:1), length(x))))
result <- y[rowSums(y) <= M, ] %*% x


However, as N gets large (above 22 for me), the expand.grid output becomes too big and gives an error (replace x above with x <- 11:55 to observe this). Ideally there would be an expand.grid function that permits restrictions on the rows before constructing the full matrix, which (at least for what I want) would keep the matrix size within memory limits.

Is there a way to achieve this without causing problems for large N?

Answer

Try this:

c(0, unlist(lapply(1:M, function(k) colSums(combn(x, k)))))

It generates the same result as with your expand.grid approach, shown below for the test data.

M <- 4
x <- 11:15

# expand.grid approach
y <- as.matrix(expand.grid(rep(list(0:1), length(x))))
result <- y[rowSums(y) <= M, ] %*% x

# combn approach
result1 <- c(0, unlist(lapply(1:M, function(k) colSums(combn(x, k)))))

all(sort(result[,1]) == sort(result1))
# [1] TRUE

This should be fast (it takes 0.227577 secs on my machine, with N=22, M=4):

x <- 1:22 # N = 22
M <- 4
c(0, unlist(lapply(1:M, function(k) colSums(combn(x, k)))))
# [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22  3  4  5  6  7 

you may want to choose the unique values of the sums with

unique(c(0, unlist(lapply(1:M, function(k) colSums(combn(x, k))))))