zeehio zeehio - 24 days ago 8
Python Question

zip iterators asserting for equal length in python

I am looking for a nice way to

zip
several iterables raising an exception if the lengths of the iterables are not equal.

In the case where the iterables are lists or have a
len
method this solution is clean and easy:

def zip_equal(it1, it2):
if len(it1) != len(it2):
raise ValueError("Lengths of iterables are different")
return zip(it1, it2)


However, if
it1
and
it2
are generators, the previous function fails because the length is not defined
TypeError: object of type 'generator' has no len()
.

I imagine the
itertools
module offers a simple way to implement that, but so far I have not been able to find it. I have come up with this home-made solution:

def zip_equal(it1, it2):
exhausted = False
while True:
try:
el1 = next(it1)
if exhausted: # in a previous iteration it2 was exhausted but it1 still has elements
raise ValueError("it1 and it2 have different lengths")
except StopIteration:
exhausted = True
# it2 must be exhausted too.
try:
el2 = next(it2)
# here it2 is not exhausted.
if exhausted: # it1 was exhausted => raise
raise ValueError("it1 and it2 have different lengths")
except StopIteration:
# here it2 is exhausted
if not exhausted:
# but it1 was not exhausted => raise
raise ValueError("it1 and it2 have different lengths")
exhausted = True
if not exhausted:
yield (el1, el2)
else:
return


The solution can be tested with the following code:

it1 = (x for x in ['a', 'b', 'c']) # it1 has length 3
it2 = (x for x in [0, 1, 2, 3]) # it2 has length 4
list(zip_equal(it1, it2)) # len(it1) < len(it2) => raise
it1 = (x for x in ['a', 'b', 'c']) # it1 has length 3
it2 = (x for x in [0, 1, 2, 3]) # it2 has length 4
list(zip_equal(it2, it1)) # len(it2) > len(it1) => raise
it1 = (x for x in ['a', 'b', 'c', 'd']) # it1 has length 4
it2 = (x for x in [0, 1, 2, 3]) # it2 has length 4
list(zip_equal(it1, it2)) # like zip (or izip in python2)


Am I overlooking any alternative solution? Is there a simpler implementation of my
zip_equal
function?

PS: I wrote the question thinking in Python 3, but a Python 2 solution is also welcome.

Answer

I can a simpler solution, use itertools.zip_longest() and raise an exception if the sentinel value used to pad out shorter iterables is present in the tuple produced:

from itertools import zip_longest

def zip_equal(*iterables):
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        if sentinel in combo:
            raise ValueError('Iterables have different lengths')
        yield combo

Unfortunately, we can't use zip() with yield from to avoid a Python-code loop with a test each iteration; once the shortest iterator runs out, zip() would advance all preceding iterators and thus swallow the evidence if there is but one extra item in those.