Paul - 1 year ago 51

Python Question

I want find all possible ways to map a series of (contiguous) integers M = {0,1,2,...,m} to another series of integers N = {0,1,2,...,n} where m > n, *subject to the constraint that only contiguous integers in M map to the same integer in N.*

The following piece of python code comes close (

`start`

`stop`

`nbins`

`import itertools`

def find_bins(start, stop, nbins):

if (nbins > 1):

return list(list(itertools.product([range(start, ii)], find_bins(ii, stop, nbins-1))) for ii in range(start+1, stop-nbins+2))

else:

return [range(start, stop)]

E.g

`In [20]: find_bins(start=0, stop=5, nbins=3)`

Out[20]:

[[([0], [([1], [2, 3, 4])]),

([0], [([1, 2], [3, 4])]),

([0], [([1, 2, 3], [4])])],

[([0, 1], [([2], [3, 4])]),

([0, 1], [([2, 3], [4])])],

[([0, 1, 2], [([3], [4])])]]

However, as you can see the output is nested, and for the life of me, I cant find a way to properly amend the code without breaking it.

The desired output would look like this:

`In [20]: find_bins(start=0, stop=5, nbins=3)`

Out[20]:

[[(0), (1), (2, 3, 4)],

[(0), (1, 2), (3, 4)],

[(0), (1, 2, 3), (4)],

[(0, 1), (2), (3, 4)],

[(0, 1), (2, 3), (4)],

[(0, 1, 2), (3), (4)]]

Answer Source

I suggest a different approach: a partitioning into `n`

non-empty bins is uniquely determined by the `n-1`

distinct indices marking the boundaries between the bins, where the first marker is after the first element, and the final marker before the last element. `itertools.combinations()`

can be used directly to generate all such index tuples, and then it's just a matter of using them as slice indices. Like so:

```
def find_nbins(start, stop, nbins):
from itertools import combinations
base = range(start, stop)
nbase = len(base)
for ixs in combinations(range(1, stop - start), nbins - 1):
yield [tuple(base[lo: hi])
for lo, hi in zip((0,) + ixs, ixs + (nbase,))]
```

Then, e.g.,

```
for x in find_nbins(0, 5, 3):
print(x)
```

displays:

```
[(0,), (1,), (2, 3, 4)]
[(0,), (1, 2), (3, 4)]
[(0,), (1, 2, 3), (4,)]
[(0, 1), (2,), (3, 4)]
[(0, 1), (2, 3), (4,)]
[(0, 1, 2), (3,), (4,)]
```

Just noting that there's a more general underlying problem here: generating the ways to break an arbitrary sequence into `n`

non-empty bins. Then the specific question here is applying that to the sequence `range(start, stop)`

. I believe viewing it that way makes the code easier to understand, so here it is:

```
def gbins(seq, nbins):
from itertools import combinations
base = tuple(seq)
nbase = len(base)
for ixs in combinations(range(1, nbase), nbins - 1):
yield [base[lo: hi]
for lo, hi in zip((0,) + ixs, ixs + (nbase,))]
def find_nbins(start, stop, nbins):
return gbins(range(start, stop), nbins)
```