justin - 6 months ago 36

Python Question

I am trying to generate an array of all combinations of an arbitrary number of arrays. From the generated array, I would like to add an constraint that the sum of the numbers must lie between two bounds (say 'lower' and 'upper')

One method of doing this is to use cartersian, sum the elements, and select the ones that fall within the lower and upper bounds. However, the main limitation is that it is possible to run out of memory given a large number of input arrays. Another method is to use itertools.product:

`import itertools`

import numpy as np

def arraysums(arrays,lower,upper):

p = itertools.product(*arrays)

r = list()

for n in p:

s = sum(n)

if lower <= s <= upper:

r.append(n)

return r

N = 8

a = np.arange(N)

b = np.arange(N)-N/2

arraysums((a,b),lower=5,upper=6)

which returns a result like this:

`[(2, 3),`

(3, 2),

(3, 3),

(4, 1),

(4, 2),

(5, 0),

(5, 1),

(6, -1),

(6, 0),

(7, -2),

(7, -1)]

This method is memory efficient, but can be very slow if the arrays are large, such as this example which runs in the 10's of minutes:

`a = np.arange(32.)`

arraysums(6*(a,),lower=10,upper=20)

I'm looking for a faster method.

Answer

You could use recursion. The main advantage here is that you can *short-circuit* the enumeration of tuples at each stage.

For example, if `item`

has been selected from the first array, then the new lower and upper limits for the rest of the arrays should be `lower-item`

and `upper-item`

, and you can automatically throw out any value in the other arrays that is bigger than `upper-item`

.
This intelligently reduces the size of your search space at each level of the recursion.

```
import itertools
def arraysums_recursive(arrays, lower, upper):
if len(arrays) <= 1:
result = [(item,) for item in arrays[0] if lower <= item <= upper]
else:
result = []
for item in arrays[0]:
subarrays = [[item2 for item2 in arr if item2 <= upper-item]
for arr in arrays[1:]]
result.extend(
[(item,)+tup for tup in arraysums(subarrays, lower-item, upper-item)])
return result
def arraysums(arrays,lower,upper):
p = itertools.product(*arrays)
r = list()
for n in p:
s = sum(n)
if lower <= s <= upper:
r.append(n)
return r
a = list(range(32))
```

For this test case, `arraysums_recursive`

is over 67x faster than `arraysums`

:

```
In [70]: %time arraysums_recursive(6*(a,),lower=10,upper=20)
CPU times: user 3.67 s, sys: 0 ns, total: 3.67 s
Wall time: 3.68 s
In [73]: %time arraysums(6*(a,),lower=10,upper=20)
CPU times: user 4min 8s, sys: 0 ns, total: 4min 8s
Wall time: 4min 8s
```