ali_m - 1 year ago 80

Python Question

In the process of using joblib to parallelize some model-fitting code involving Theano functions, I've stumbled across some behavior that seems odd to me.

Consider this very simplified example:

    from joblib import Parallel, delayed
    import theano
    from theano import tensor as te
    import numpy as np


    class TheanoModel(object):
        def __init__(self):
            X = te.dvector('X')
            Y = (X ** te.log(X ** 2)).sum()
            self.theano_get_Y = theano.function([X], Y)

        def get_Y(self, x):
            return self.theano_get_Y(x)


    def run(niter=100):
        x = np.random.randn(1000)
        model = TheanoModel()
        pool = Parallel(n_jobs=-1, verbose=1, pre_dispatch='all')

        # this fails with `TypeError: can't pickle instancemethod objects`...
        results = pool(delayed(model.get_Y)(x) for _ in xrange(niter))

        # ... but this works! Why?
        # results = pool(delayed(model.theano_get_Y)(x) for _ in xrange(niter))


    if __name__ == '__main__':
        run()

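I understand why the first case fails, since `.get_Y()` is clearly an instance method of `TheanoModel`. What I don't understand is why the second case works: `X`, `Y` and `theano_get_Y()` are only declared within the `__init__()` method of `TheanoModel`, so `theano_get_Y()` can't be evaluated until the `TheanoModel` instance has been created. Surely, then, it should also be considered an instance method, and should therefore be unpicklable? In fact, it still works even if I explicitly declare `X` and `Y` to be attributes of the `TheanoModel` instance.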

Can anyone explain what's going on here?

Just to illustrate why I think this behavior is particularly weird, here are a few examples of some other callable member objects that don't take `self` as their first argument:

    from joblib import Parallel, delayed
    import theano
    from theano import tensor as te
    import numpy as np


    class TheanoModel(object):
        def __init__(self):
            X = te.dvector('X')
            Y = (X ** te.log(X ** 2)).sum()
            self.theano_get_Y = theano.function([X], Y)

            def square(x):
                return x ** 2

            self.member_function = square
            self.static_method = staticmethod(square)
            self.lambda_function = lambda x: x ** 2


    def run(niter=100):
        x = np.random.randn(1000)
        model = TheanoModel()
        pool = Parallel(n_jobs=-1, verbose=1, pre_dispatch='all')

        # not allowed: `TypeError: can't pickle function objects`
        # results = pool(delayed(model.member_function)(x) for _ in xrange(niter))

        # not allowed: `TypeError: can't pickle function objects`
        # results = pool(delayed(model.lambda_function)(x) for _ in xrange(niter))

        # also not allowed: `TypeError: can't pickle staticmethod objects`
        # results = pool(delayed(model.static_method)(x) for _ in xrange(niter))

        # but this is totally fine!?
        results = pool(delayed(model.theano_get_Y)(x) for _ in xrange(niter))


    if __name__ == '__main__':
        run()

None of them is picklable, with the exception of the `theano.function`.

Answer

Theano functions aren't Python functions. Instead, they are Python objects that override `__call__`. This means that you can call them just like a function, but internally they are really objects of some custom class. As a consequence, you can pickle them.
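To see the distinction without Theano, here is a minimal sketch using only the standard `pickle` module. `Squarer` and `can_pickle` are hypothetical names made up for this illustration, with `Squarer` standing in for a compiled Theano function. It does not reproduce how Theano or joblib serialize objects internally, and the exact exception raised for the lambda may differ between Python versions.

```python
import pickle


class Squarer(object):
    """Hypothetical stand-in for a compiled Theano function:
    an ordinary object that happens to be callable."""

    def __init__(self, power):
        self.power = power

    def __call__(self, x):
        return x ** self.power


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


callable_object = Squarer(2)
lambda_function = lambda x: x ** 2

# The callable object round-trips: pickle records a reference to the
# module-level class Squarer plus the instance's attributes.
restored = pickle.loads(pickle.dumps(callable_object))

print(can_pickle(callable_object))  # instance of a module-level class
print(can_pickle(lambda_function))  # anonymous function, no importable name
print(restored(3))
```

The instance pickles because its class can be found again by name at module level, whereas the lambda has no importable name for pickle to refer to. The same reasoning would apply to the nested `square` function in the question.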