MichaelMaggs MichaelMaggs - 3 months ago 19
Python Question

Understanding a memoization decorator as a closure

I'm trying to learn about Python decorators, and I'd like to understand in more detail exactly how closure applies, for example in this memoization context:

def memoize(f):
memo = {}
def helper(x):
if x not in memo:
memo[x] = f(x)
return memo[x]
return helper

@memoize
def fib(n):
if n in (0,1):
return n
return fib(n - 1) + fib(n - 2)


I understand that
memoize
returns functions that are bound to
memo
values in the enclosing scope of
helper
, even when the program flow is no longer in that enclosing scope. So, if
memoize
is called repeatedly it will return a varying sequence of functions depending upon the current values of
memo
. I also understand that
@memoize
is syntactic sugar that causes calls to
fib(n)
to be replaced with calls to
memoize(fib(n))
.

Where I am struggling is how the called value of
n
in
fib(n)
is effectively converted to the
x
in
helper(x)
. Most tutorials on closures don't seem to make this point explicit, or they rather vaguely say that one function ‘closes over’ the other, which just makes it sound like magic. I can see how to use the syntax, but would like to have a better grasp of exactly what is happening here and what objects and data are being passed around as the code executes.

Answer

You've misunderstood what the @memoize decorator does. The decorator is not called each time you call fib. The decorator is called just once per function that is decorated.

The original fib function is replaced by the helper() function, with the original function object available as the f closure, together with the memo closure:

>>> def fib(n):
...     if n in (0,1):
...         return n
...     return fib(n - 1) + fib(n - 2)
...
>>> fib
<function fib at 0x1023398c0>
>>> memoize(fib)
<function helper at 0x102339758>

This gives the replacement helper function one closure, and each time you call that helper() function the same memo dictionary is used to look up already-produced results for the current value of x.

You can see this all work in the interpreter:

>>> @memoize
... def fib(n):
...     if n in (0,1):
...         return n
...     return fib(n - 1) + fib(n - 2)
...
>>> fib
<function helper at 0x10232ae60>

Note the function name there! That's helper. It has a closure:

>>> fib.__closure__
(<cell at 0x10232d9f0: function object at 0x10232ade8>, <cell at 0x10232da98: dict object at 0x102353050>)

A function object and a dictionary, those are f and memo, respectively. Access the cell contents with .cell_contents:

>>> fib.__closure__[0].cell_contents
<function fib at 0x10232ade8>
>>> fib.__closure__[1].cell_contents
{}

That's the original decorated function and the empty dictionary alright. Let's use this function:

>>> fib(0)
0
>>> fib(1)
1
>>> fib.__closure__[1].cell_contents
{0: 0, 1: 1}

That's the resulting values for x set to 0 and 1 being stored in the memo dictionary. Next time I call fib with either value, the memo is used. New x values are calculated and added:

>>> fib(2)
1
>>> fib.__closure__[1].cell_contents
{0: 0, 1: 1, 2: 1}

You can see how well that works by timing how long a larger number takes to calculate:

>>> start = time.clock(); fib(500); print(format(time.clock() - start, '.10f'))
139423224561697880139724382870407283950070256587697307264108962948325571622863290691557658876222521294125L
0.0008390000
>>> start = time.clock(); fib(500); print(format(time.clock() - start, '.10f'))
139423224561697880139724382870407283950070256587697307264108962948325571622863290691557658876222521294125L
0.0000220000

>>> start = time.clock(); fib(500); print(format(time.clock() - start, '.10f'))
139423224561697880139724382870407283950070256587697307264108962948325571622863290691557658876222521294125L
0.0000190000

After the first call, looking up the memo is 40 times as fast as all the intermediate results have been memoized:

>>> len(fib.__closure__[1].cell_contents)
501