I expected that, with multiple loops, iterating over a list would be much faster than iterating over a generator, but my code suggests this is false. My understanding is that a list stores every element once it is built, while a generator must perform each operation again on every pass (by operation I mean any expression defining an element):
from timeit import timeit

def pow2_list(n=1000):
    """Return a list with powers of 2"""
    results = []
    for i in range(n):
        results.append(2**i)
    return results

def pow2_gen(n=1000):
    """Generator of powers of 2"""
    for i in range(n):
        yield 2**i

def loop(iterator, n=1000):
    """Loop n times over iterable object"""
    for _ in range(n):
        for _ in iterator:
            pass

l = pow2_list(1000)  # point to a list
g = pow2_gen(1000)   # point to a generator

time_list = \
    timeit("loop(l)", setup="from __main__ import loop, l", number=10)
time_gen = \
    timeit("loop(g)", setup="from __main__ import loop, g", number=10)

print("Loops over list took: ", time_list)
print("Loops over generator took: ", time_gen)
print("Loops over list took: ", time_list)
print("Loops over generator took: ", time_gen)
Loops over list took: 0.20484769299946493
Loops over generator took: 0.0019217690005461918
Your generator is actually only looped over once. Once created with g = pow2_gen(1000), g stores a generator; the very first time through loop, this generator is consumed and raises StopIteration. On every subsequent pass through loop, next(g) just continues to raise StopIteration, so, in effect, g represents an empty sequence.
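A minimal sketch of this exhaustion behaviour, using a throwaway generator expression rather than the pow2_gen above:

```python
# A generator can only be consumed once; later passes see it as empty.
g = (2**i for i in range(3))

first_pass = list(g)   # consumes the generator
second_pass = list(g)  # generator is already exhausted

print(first_pass)   # [1, 2, 4]
print(second_pass)  # []
```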
To make the comparison more fair, you would need to re-create the generator every time you loop.
A further difficulty with the way you've approached this is that you're calling append to build your list, which is likely the slowest way to construct a list. More often, lists are built with list comprehensions.
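To illustrate the two construction styles side by side (a sketch; exact timings will vary by machine and Python version):

```python
def with_append(n=1000):
    """Build the list by repeatedly calling append (slower)."""
    results = []
    for i in range(n):
        results.append(2**i)
    return results

def with_comprehension(n=1000):
    """Build the same list with a list comprehension (usually faster)."""
    return [2**i for i in range(n)]

# Both produce identical lists; the comprehension avoids the repeated
# attribute lookup and method call for results.append on every iteration.
assert with_append(10) == with_comprehension(10)
```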
The following code lets us pick apart the timing a bit more carefully. create_list and create_gen create lists and generators, respectively, using a list comprehension and a generator expression. time_loop is like your loop method, while time_apply is a version of loop that re-creates the iterable each time through the loop.
from timeit import timeit

def create_list(n=1000):
    return [2**i for i in range(n)]

def create_gen(n=1000):
    return (2**i for i in range(n))

def time_loop(iterator, n=1000):
    for t in range(n):
        for v in iterator:
            pass

def time_apply(create_fn, fn_arg, n=1000):
    for t in range(n):
        iterator = create_fn(fn_arg)
        time_loop(iterator, 1)

print('time_loop(create_list): %.3f' % timeit("time_loop(create_list(1000))", setup="from __main__ import *", number=10))
print('time_loop(create_gen): %.3f' % timeit("time_loop(create_gen(1000))", setup="from __main__ import *", number=10))
print('time_apply(create_list): %.3f' % timeit("time_apply(create_list, 1000)", setup="from __main__ import *", number=10))
print('time_apply(create_gen): %.3f' % timeit("time_apply(create_gen, 1000)", setup="from __main__ import *", number=10))
Results on my box suggest that building a list (time_apply(create_list)) is similar in time to (or maybe even faster than) building a generator (time_apply(create_gen)):
time_loop(create_list): 0.244
time_loop(create_gen): 0.028
time_apply(create_list): 21.190
time_apply(create_gen): 21.555
You can see the same effect you've documented in your question: time_loop(create_gen) is an order of magnitude faster than time_loop(create_list). Again, this is because the generator is only iterated over once, rather than looped over many times like the list.
As you hypothesise, building a list once and iterating over it many times (time_loop(create_list)) is faster than iterating over a freshly created generator many times (time_apply(create_gen)) in this particular scenario.
The trade-off between list and generator is going to be strongly dependent on how big the iterator you're creating is. With 1000 items, I would expect lists to be pretty fast. With 100,000 items, things might look different.
print('create big list: %.3f' % timeit("l = create_list(100000)", setup="from __main__ import *", number=10))
print('create big gen: %.3f' % timeit("g = create_gen(100000)", setup="from __main__ import *", number=10))
Here I'm getting:
create big list: 209.748
create big gen: 0.023
Python uses up between 700 and 800 MB of memory building the big list; the generator uses almost nothing at all. Memory allocation and garbage collection are computationally expensive in Python, and predictably make your code slow; generators are a very simple way to avoid gobbling up your machine's RAM, and can make a big difference to runtime.
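A rough way to see the memory asymmetry directly is sys.getsizeof; note it reports only the container object's own size, and the exact numbers are platform-dependent:

```python
import sys

# The list must hold a reference to every element it has computed;
# the generator object stays a small, constant size regardless of n.
powers_list = [2**i for i in range(1000)]
powers_gen = (2**i for i in range(1000))

print(sys.getsizeof(powers_list))  # thousands of bytes, grows with n
print(sys.getsizeof(powers_gen))   # roughly a couple hundred bytes
```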