fonini fonini - 7 months ago 90
Python Question

I can "pickle local objects" if I use a derived class?

The

pickle
reference states that the set of objects which can be pickled is rather limited. Indeed, I have a function which returns a dinamically-generated class, and I found I can't pickle instances of that class:

>>> import pickle
>>> def f():
... class A: pass
... return A
...
>>> LocalA = f()
>>> la = LocalA()
>>> with open('testing.pickle', 'wb') as f:
... pickle.dump(la, f, pickle.HIGHEST_PROTOCOL)
...
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
AttributeError: Can't pickle local object 'f.<locals>.A'


Such objects are too complicated for
pickle
. Ok. Now, what's magic is that, if I try to pickle a similar object, but of a derived class, it works!

>>> class DerivedA(LocalA): pass
...
>>> da = DerivedA()
>>> with open('testing.pickle', 'wb') as f:
... pickle.dump(da, f, pickle.HIGHEST_PROTOCOL)
...
>>>


What's happening here? If this is so easy, why doesn't
pickle
use this workaround to implement a
dump
method that allows "local objects" to be pickled?

Answer

I think you did not read the reference you cite carefully. The reference also clearly states that only the following objects are pickleable:

  • functions defined at the top level of a module (using def, not >lambda)
  • built-in functions defined at the top level of a module
  • classes that are defined at the top level of a module

Your example

>>> def f():
...     class A: pass
...     return A

does not define a class at the top level of a module, it defines a class within the scope of f(). pickle works on global classes, not local classes. This automatically fails the pickleable test.

DerivedA is a global class, so all is well.

As for why only top-level (global to you) classes and functions can't be pickled, the reference answers that question as well (bold mine):

Note that functions (built-in and user-defined) are pickled by “fully qualified” name reference, not by value. This means that only the function name is pickled, along with the name of the module the function is defined in. Neither the function’s code, nor any of its function attributes are pickled. Thus the defining module must be importable in the unpickling environment, and the module must contain the named object, otherwise an exception will be raised.

Similarly, classes are pickled by named reference, so the same restrictions in the unpickling environment apply.

So there you have it. pickle only serialises objects by name reference, not by the raw instructions contained within the object. This is because pickle's job is to serialise object hierarchy, and nothing else.