theMobDog - 1 month ago
Python Question

Methods that take iterators instead of iterables

Regarding iterators and iterables (my observation only and please correct me if I am wrong):


  • Most constructors (of arrayish-types) take iterators as mass-constructor

  • Iterators are explicitly made, or made by using x in x for....

  • Many methods (mostly in itertools) return iterators (because their job is to iterate?)

  • Methods that take iterables take iterators. Is this true in all cases?

  • Methods that take iterators won't take iterables (reverse is not true)

  • The only method that explicitly takes an iterator seems to be next(..



Questions:


  • Are there other methods that take iterators?

  • What are the other ways to make iterators with syntax? e.g. x in x for...

  • Why did the Python creators leave next(.. as the only method taking iterators? Couldn't they easily have made it a method that takes an iterable, with extra arguments (conditions)?


Answer

The language around iterators and iterables is a bit confusing. The main confusion comes from the term "iterable", which may or may not be a superset of "iterator", depending on how it's being used.

Here's how I'd categorize things:

An iterable is any object that can be iterated upon. That is, it has an __iter__() method that returns an iterator, or it is indexable with integers (raising an IndexError exception when they're out of range), which lets Python build an iterator for it automatically. This is a very broad category.
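Here's a minimal sketch of both routes to being iterable. The class names `Countdown` and `Squares` are made up for illustration and don't come from any library.

```python
class Countdown:
    """Iterable via __iter__: hands out a fresh iterator on every call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Delegate to the built-in iterator over a range.
        return iter(range(self.start, 0, -1))


class Squares:
    """Iterable via integer indexing only (no __iter__ defined)."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        if not 0 <= index < self.count:
            raise IndexError(index)  # signals the end of the iteration
        return index * index


countdown_items = list(Countdown(3))  # uses __iter__
square_items = list(Squares(4))       # uses the __getitem__ fallback
```

Both objects work in a for loop or anywhere else an iterable is accepted, even though `Squares` never mentions iteration at all.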

An iterator is an object that follows the iterator protocol. It has a __next__() method (spelled next in Python 2) that returns the next item, or raises a StopIteration exception if there are no more values available. An iterator must also have an __iter__() method that returns itself, so all iterators are also iterable (since they meet the definition of "iterable" given above).
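As a rough illustration, a hand-written iterator in Python 3 could look like this. `CountUp` is a hypothetical class used only for this example.

```python
class CountUp:
    """A hand-written iterator that produces 1, 2, ..., limit."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        # Iterators return themselves, which is what makes them iterable.
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current


counter = CountUp(3)
is_own_iterator = iter(counter) is counter
values = list(counter)
leftover = list(counter)  # already exhausted, so nothing is left
```

Note that the second `list(counter)` comes back empty: an iterator carries its position with it and does not start over.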

A non-iterator iterable is any iterable that is not an iterator. This is often what people mean when they use the term "iterable" in contrast to "iterator". A better term in many contexts might be "sequence", but that's a bit more specific (some non-sequence objects are non-iterator iterables, like dictionaries which allow iteration over their keys). The important feature of this category of objects is that you can iterate on them multiple times, and the iterators work independently of one another.
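A small sketch of that independence, using a list and a dictionary (the variable names are just for illustration):

```python
letters = ["a", "b", "c"]  # a list is a non-iterator iterable

first = iter(letters)
second = iter(letters)

next(first)                # advance only the first iterator
first_rest = list(first)   # what remains after skipping one item
second_all = list(second)  # unaffected by what happened to `first`

# A dictionary is iterable (over its keys) without being a sequence.
ages = {"ann": 31, "bob": 27}
keys_once = list(ages)
keys_again = list(ages)    # a second pass works just as well
```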

So to try to answer your specific questions:

There's rarely a good reason for any function to require an iterator specifically. Functions can usually be made to work just as well with any kind of iterable argument, either by calling iter() on the argument to get an iterator, or by using a for loop, which creates the iterator behind the scenes.
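For instance, a function that needs to treat the first item specially can call iter() itself rather than demanding an iterator from the caller. `pairwise_sums` below is a hypothetical helper written for this answer, not something from the standard library.

```python
def pairwise_sums(values):
    """Yield the sum of each adjacent pair in any iterable."""
    iterator = iter(values)  # returns the same object if it is already an iterator
    try:
        previous = next(iterator)
    except StopIteration:
        return               # empty input: nothing to yield
    for current in iterator:
        yield previous + current
        previous = current


from_list = list(pairwise_sums([1, 2, 3, 4]))
from_iterator = list(pairwise_sums(iter([1, 2, 3, 4])))
```

The caller can pass a list, a generator, or any other iterable and get the same result.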

The reverse is different. If a function requires a non-iterator iterable, it may need to iterate on the argument several times and so an iterator will not work properly. Functions in the Python standard library (and builtins) rarely have such a limitation though. If they need to iterate multiple times on an iterable argument, they'll often dump it into a sequence type (e.g. a list) at the start if it's not a sequence already.
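To make the problem concrete, here is a sketch of a two-pass function with and without that defensive copy. Both function names are invented for the example.

```python
def normalize_naive(values):
    """Two passes over `values`; silently breaks when given an iterator."""
    total = sum(values)                 # first pass consumes an iterator
    return [v / total for v in values]  # second pass then finds nothing


def normalize(values):
    """Same idea, but materializes non-sequence input first."""
    if not isinstance(values, (list, tuple)):
        values = list(values)           # dump iterators into a list
    total = sum(values)
    return [v / total for v in values]


broken = normalize_naive(iter([1, 1, 2]))
fixed = normalize(iter([1, 1, 2]))
from_sequence = normalize([1, 1, 2])
```

The naive version does not raise an error when handed an iterator; it just returns an empty list, which is why this kind of bug is easy to miss.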

Many functions return iterators. All generator objects are iterators, for instance (both those returned by generator functions and those created with generator expressions). File objects are also iterators (though they violate the iterator protocol a little bit since you can restart them after they're exhausted using their seek() method). And all the functions and types in the itertools module return iterators, but so do some builtins like map() (in Python 3).
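A quick way to check some of these claims in Python 3 is to test that iter() returns the object itself and that next() works on it. The `evens` generator function is made up for the example.

```python
import itertools


def evens(limit):
    """Generator function: calling it returns a generator object."""
    for n in range(0, limit, 2):
        yield n


candidates = [
    evens(10),                  # from a generator function
    (n * n for n in range(3)),  # from a generator expression
    itertools.count(1),         # from itertools
    map(str, [1, 2, 3]),        # builtin map() in Python 3
]

# An iterator returns itself from iter() and supports next().
all_iterators = all(iter(obj) is obj for obj in candidates)
first_items = [next(obj) for obj in candidates]
```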

The next() function is indeed unusual, since it specifically requires an iterator. This is because it's defined as a part of the iteration protocol itself. Calling next() on an iterator is equivalent to calling its __next__() method, just nicer to read. It also has a two-argument form which suppresses the StopIteration exception that would otherwise be raised if the iterator is exhausted (it returns the default argument instead).
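A short sketch of those behaviours in Python 3, including what happens if you pass next() an iterable that is not an iterator:

```python
numbers = iter([10, 20])

a = next(numbers)        # same effect as numbers.__next__()
b = numbers.__next__()
c = next(numbers, None)  # exhausted: returns the default, no exception

try:
    next([10, 20])       # a list is iterable but not an iterator
    list_error = None
except TypeError as exc:
    list_error = type(exc).__name__
```

If you have an iterable and want its first item, call `next(iter(obj))` instead.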
