I'm reading the Python Cookbook at the moment and currently looking at generators. I'm finding them hard to get my head round.
As I come from a Java background, is there a Java equivalent? The book was speaking about 'Producer / Consumer', however when I hear that I think of threading.
Can anyone explain what a generator is and why you would use it? Without quoting any books, obviously (unless you can find a decent, simplistic answer direct from a book). Perhaps with examples, if you're feeling generous!
Note: this post assumes Python 3.x syntax.†
A generator is simply a function which returns an object on which you can call next, such that for every call it returns some value, until it raises a StopIteration exception, signaling that all values have been generated. Such an object is called an iterator.
Normal functions return a single value using return, just like in Java. In Python, however, there is an alternative, called yield. Using yield anywhere in a function makes it a generator. Observe this code:
    >>> def myGen(n):
    ...     yield n
    ...     yield n + 1
    ...
    >>> g = myGen(6)
    >>> next(g)
    6
    >>> next(g)
    7
    >>> next(g)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
As you can see, myGen(n) is a function which yields n and n + 1. Every call to next yields a single value, until all values have been yielded.
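Since the question comes from a Java background, it may help to see roughly what a generator saves you from writing. The sketch below is a hand-written iterator class that behaves like myGen; the class name MyGenIterator and its internals are made up for this comparison. It plays a role similar to implementing java.util.Iterator, except that Python signals exhaustion by raising StopIteration rather than through a separate hasNext() method.

```python
class MyGenIterator:
    """Hand-written iterator producing n, then n + 1 (hypothetical, for comparison)."""

    def __init__(self, n):
        self._values = [n, n + 1]
        self._index = 0

    def __iter__(self):
        # An iterator returns itself from __iter__, so it also works in for loops.
        return self

    def __next__(self):
        # Raise StopIteration once every value has been handed out.
        if self._index >= len(self._values):
            raise StopIteration
        value = self._values[self._index]
        self._index += 1
        return value


it = MyGenIterator(6)
print(next(it))                 # 6
print(next(it))                 # 7
print(list(MyGenIterator(6)))   # [6, 7]
```

The generator version expresses the same behaviour in three lines, because Python keeps track of the position between calls for you.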
for loops call next in the background, thus:
    >>> for n in myGen(6):
    ...     print(n)
    ...
    6
    7
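To make "in the background" concrete, here is a rough sketch of what a for loop does, written out with explicit calls. The helper name consume is invented for this illustration, and my_gen is just the generator from above under a different name; the real loop machinery lives inside the interpreter, so treat this as an approximation.

```python
def my_gen(n):
    """Same two-value generator as myGen above, renamed only for this sketch."""
    yield n
    yield n + 1


def consume(iterable):
    """Collect values roughly the way a for loop does, using explicit next() calls."""
    collected = []
    iterator = iter(iterable)        # a for loop first asks for an iterator
    while True:
        try:
            value = next(iterator)   # then it calls next() repeatedly
        except StopIteration:        # and ends quietly on StopIteration
            break
        collected.append(value)
    return collected


print(consume(my_gen(6)))  # [6, 7]
```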
Likewise there are generator expressions, which provide a means to succinctly describe certain common types of generators:
    >>> g = (n for n in range(3, 5))
    >>> next(g)
    3
    >>> next(g)
    4
    >>> next(g)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
Note that generator expressions are much like list comprehensions:
    >>> lc = [n for n in range(3, 5)]
    >>> lc
    [3, 4]
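The two look alike, but they behave differently in practice. A small sketch of the difference (the variable names are only for illustration): the list is built in full and can be reused, while the generator expression produces values on demand and is used up after one pass.

```python
squares_list = [n * n for n in range(5)]   # built in full, right away
squares_gen = (n * n for n in range(5))    # nothing has been computed yet

first_pass = list(squares_gen)    # consumes the generator
second_pass = list(squares_gen)   # already exhausted, so this is empty

print(squares_list)   # [0, 1, 4, 9, 16]
print(first_pass)     # [0, 1, 4, 9, 16]
print(second_pass)    # []

# A generator expression can be passed straight to a function such as sum,
# without building an intermediate list.
print(sum(n * n for n in range(5)))  # 30
```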
Observe that a generator object is generated once, but its code is not run all at once. Only calls to next actually execute (part of) the code. Execution of the code in a generator stops once a yield statement has been reached, upon which it returns a value. The next call to next then causes execution to continue in the state in which the generator was left after the last yield. This is a fundamental difference with regular functions: those always start execution at the "top" and discard their state upon returning a value.
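One way to watch this pausing and resuming is to record what happens and in which order. The following is a small sketch; the names traced and events are invented for the illustration.

```python
events = []


def traced():
    events.append("body started")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("resumed after second yield")


g = traced()                      # creates the generator; runs none of the body
events.append("generator object created")

first = next(g)                   # runs the body up to the first yield
events.append(f"got {first}")

second = next(g)                  # continues from the first yield to the second
events.append(f"got {second}")

for line in events:
    print(line)
# generator object created
# body started
# got 1
# resumed after first yield
# got 2
```

Note that "body started" only appears after the generator object has been created, and that the last line of the body never runs, because next is not called a third time.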
There is more to be said about this subject. It is, for example, possible to send data back into a generator using its send method. But that is something I suggest you do not look into until you understand the basic concept of a generator.
Now you may ask: why use generators? There are a couple of good reasons:
Generators allow for a natural way to describe infinite streams. Consider for example the Fibonacci numbers:
    >>> def fib():
    ...     a, b = 0, 1
    ...     while True:
    ...         yield a
    ...         a, b = b, a + b
    ...
    >>> import itertools
    >>> list(itertools.islice(fib(), 10))
    [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
This code uses itertools.islice to take a finite number of elements from an infinite stream. You are advised to have a good look at the functions in the itertools module, as they are essential tools for writing advanced generators with great ease.
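As a taste of what else the module offers, here is a short sketch that combines the same Fibonacci generator with itertools.takewhile and with a generator expression. The variable names are only illustrative.

```python
import itertools


def fib():
    """Infinite stream of Fibonacci numbers, as in the example above."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# takewhile stops at the first element that fails the test, so it ends
# cleanly even though fib() itself never does.
below_fifty = list(itertools.takewhile(lambda x: x < 50, fib()))
print(below_fifty)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Generators compose: filter the infinite stream lazily, then slice it.
even_fibs = (x for x in fib() if x % 2 == 0)
print(list(itertools.islice(even_fibs, 5)))   # [0, 2, 8, 34, 144]
```

Nothing here ever materialises the whole sequence; each stage only asks the previous one for as many values as it needs.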
† About Python 2: in the above examples next is a built-in function which calls the method __next__ on the given object. In Python 2 that method is named next instead, so one can write o.next() instead of next(o). The built-in next(o) is also available from Python 2.6 onward, where it calls o.next(), so you only need the following form in older versions:
    >>> g = (n for n in range(3, 5))
    >>> g.next()
    3