kjo - 1 year ago 64
Python Question

# Simple idiom to break an n-long list into k-long chunks, when n % k > 0?

In Python, it is easy to break an n-long list into k-size chunks if n is a multiple of k (IOW, `n % k == 0`). Here's my favorite approach (straight from the docs):

``````
>>> k = 3
>>> n = 5 * k
>>> x = range(n)
>>> zip(*[iter(x)] * k)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
``````

(The trick is that `[iter(x)] * k` produces a list of k references to the same iterator, as returned by `iter(x)`. Then `zip` generates each chunk by advancing that shared iterator once through each of the k references. The `*` before `[iter(x)] * k` is necessary because `zip` expects to receive its arguments as "separate" iterators, rather than a list of them.)
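To make the shared-iterator point concrete, here is a small sketch. It is written for Python 3, where `zip` returns an iterator and so needs a `list(...)` wrapper to display the chunks; the variable names are only illustrative.

```python
x = list(range(6))
k = 3

it = iter(x)
refs = [it] * k  # k references to the *same* iterator object

# Every slot points at one object, so advancing one advances them all.
assert all(r is it for r in refs)

first = next(refs[0])   # 0
second = next(refs[1])  # 1, not 0: the shared iterator has already moved on
third = next(refs[2])   # 2

# zip(*refs) does the same: one next() per reference for each output tuple.
chunks = list(zip(*[iter(x)] * k))
print(chunks)  # [(0, 1, 2), (3, 4, 5)]

# For contrast, k *independent* iterators just repeat each element k times.
independent = list(zip(*[iter(x) for _ in range(k)]))
print(independent[:2])  # [(0, 0, 0), (1, 1, 1)]
```

The contrast at the end shows why the idiom depends on list multiplication copying the reference rather than creating new iterators.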

The main shortcoming I see with this idiom is that, when n is not a multiple of k (IOW, `n % k > 0`), the leftover entries are silently dropped; e.g.:

``````
>>> zip(*[iter(x)] * (k + 1))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11)]
``````

There's an alternative idiom that is slightly longer to type, produces the same result as the one above when `n % k == 0`, and has a more acceptable behavior when `n % k > 0`:

``````
>>> map(None, *[iter(x)] * k)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
>>> map(None, *[iter(x)] * (k + 1))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, None)]
``````

Here, at least, the leftover entries are retained, but the last chunk gets padded with `None`. If one just wants a different value for the padding, then `itertools.izip_longest` solves the problem.
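As a sketch of the custom-padding case: the example below uses Python 3, where the function is named `itertools.zip_longest` (it is `izip_longest` in Python 2). The fill value `-1` is an arbitrary choice for illustration.

```python
from itertools import zip_longest  # named izip_longest in Python 2

x = range(15)
k = 4

# Same shared-iterator trick, but the short final chunk is padded
# with the chosen fillvalue instead of None.
padded = list(zip_longest(*[iter(x)] * k, fillvalue=-1))
print(padded)
# [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, -1)]
```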

But suppose the desired solution is one in which the last chunk is left unpadded, i.e.

``````
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14)]
``````

Is there a simple way to modify the `map(None, *[iter(x)]*k)` idiom to produce this result?

(Granted, it is not difficult to solve this problem by writing a function; see, for example, the many fine replies to "How do you split a list into evenly sized chunks in Python?" or "What is the most 'pythonic' way to iterate over a list in chunks?". Therefore, a more accurate title for this question would be "How to salvage the `map(None, *[iter(x)]*k)` idiom?", but I think it would baffle a lot of readers.)

I was struck by how easy it is to break a list into even-sized chunks, and how difficult (in comparison!) it is to get rid of the unwanted padding, even though the two problems seem of comparable complexity.
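One way to see where the extra difficulty comes from is to strip the padding after the fact. The sketch below is an assumption-laden illustration, not a salvaged one-liner: `chunks_unpadded` is a hypothetical helper name, and it targets Python 3's `itertools.zip_longest`. It pads with a unique sentinel object so that real `None` values in the data are not mistaken for padding.

```python
from itertools import zip_longest


def chunks_unpadded(iterable, k):
    """Hypothetical helper: yield k-tuples, leaving the last one short."""
    sentinel = object()  # unique marker that cannot collide with real data
    for chunk in zip_longest(*[iter(iterable)] * k, fillvalue=sentinel):
        if chunk[-1] is sentinel:
            # Only the final chunk can contain padding; filter it out.
            chunk = tuple(v for v in chunk if v is not sentinel)
        yield chunk


result = list(chunks_unpadded(range(15), 4))
print(result)
# [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14)]
```

Unlike slicing, this works on any iterable, including generators that have no length, at the cost of an extra check per chunk.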

``````
[x[i:i+k] for i in range(0, n, k)]