I'm trying to use Python's shelve module to save my session output and reload it later, but I have found that if I have defined any functions, I get an error at the reloading stage. Is there a problem with the way I am doing it? I based my code on an answer to "How can I save all the variables in the current python session?".
Here's some simple code that reproduces the error:
import shelve

def test_fn():  # simple test function
    return

my_shelf = shelve.open('test_shelve', 'n')
for key in globals().keys():
    try:
        my_shelf[key] = globals()[key]
    except:  # __builtins__, my_shelf, and imported modules cannot be shelved.
        pass
my_shelf.close()
ls -lh test_shelve*
-rw-r--r-- 1 user group 22K Aug 24 11:16 test_shelve.bak
-rw-r--r-- 1 user group 476K Aug 24 11:16 test_shelve.dat
-rw-r--r-- 1 user group 22K Aug 24 11:16 test_shelve.dir
import shelve
my_shelf = shelve.open('test_shelve')
for key in my_shelf:
    globals()[key] = my_shelf[key]
AttributeError                            Traceback (most recent call last)
<ipython-input-4-deb481380237> in <module>()
----> 1 print my_shelf['test_fn']

/home/user/anaconda2/envs/main/lib/python2.7/shelve.pyc in __getitem__(self, key)
    120         except KeyError:
    121             f = StringIO(self.dict[key])
--> 122             value = Unpickler(f).load()
    123             if self.writeback:
    124                 self.cache[key] = value

AttributeError: 'module' object has no attribute 'test_fn'
You can't use shelve (or pickle, the actual protocol used by shelve) to store executable code, no.
What is stored is a reference to the function (just the location from which the function can be imported again). Code is not data; only the fact that you referenced a function is data here. Pickle expects to be able to load the same module and function again when you load the stored information.
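As a rough illustration, here is a minimal sketch written for Python 3 (the question uses Python 2.7, where the principle is the same). The module name demo_mod and the functions tiny and large are hypothetical; a throwaway module is registered in sys.modules so the sketch does not depend on how the script is launched.

```python
import pickle
import sys
import types

# Hypothetical throwaway module standing in for the script that defines the functions.
demo_mod = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = demo_mod

source = '''
def tiny():
    return 1

def large():
    total = 0
    for i in range(1000):
        total += i * i
    return total
'''
exec(source, demo_mod.__dict__)

tiny_payload = pickle.dumps(demo_mod.tiny, protocol=0)
large_payload = pickle.dumps(demo_mod.large, protocol=0)

# Both payloads hold just the module name and the function name.
print(tiny_payload)
print(large_payload)
```

The payload for large is no bigger than its name requires, because none of its body is stored.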
The same would apply to classes; if you pickle a reference to a class, or pickle an instance of a class, then only the information to import the class again is stored (to re-create the reference or instance).
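A small sketch of the class case, again for Python 3 and with hypothetical names (shapes_demo, Point, describe). It simulates editing the module between saving and loading: the instance attributes come back from the pickle, while the method comes from whatever class definition exists at load time.

```python
import pickle
import sys
import types

# Hypothetical throwaway module standing in for the file that defines the class.
shapes_demo = types.ModuleType("shapes_demo")
sys.modules["shapes_demo"] = shapes_demo

exec('''
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def describe(self):
        return "old code"
''', shapes_demo.__dict__)

payload = pickle.dumps(shapes_demo.Point(1, 2))

# Simulate editing the module between sessions: same class name, new method body.
exec('''
class Point(object):
    def describe(self):
        return "new code"
''', shapes_demo.__dict__)

restored = pickle.loads(payload)
print(restored.x, restored.y)   # the instance attributes were stored
print(restored.describe())      # the method comes from the current class definition
```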
All this is done because you already have a persisted and loadable representation of that function or class: the module that defines them. There is no need to store another copy.
This is documented explicitly in the What can be pickled and unpickled? section:
Note that functions (built-in and user-defined) are pickled by “fully qualified” name reference, not by value. This means that only the function name is pickled, along with the name of the module the function is defined in. Neither the function’s code, nor any of its function attributes are pickled. Thus the defining module must be importable in the unpickling environment, and the module must contain the named object, otherwise an exception will be raised.
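The failure mode that passage describes can be reproduced in a single process. This is only a sketch for Python 3, assuming a hypothetical throwaway module named session_mod in place of your script; deleting the function from it mimics a new session that never defined test_fn.

```python
import pickle
import sys
import types

# Hypothetical throwaway module standing in for the script from the first session.
session_mod = types.ModuleType("session_mod")
sys.modules["session_mod"] = session_mod
exec("def test_fn():\n    return 'hello'\n", session_mod.__dict__)

payload = pickle.dumps(session_mod.test_fn)

# "New session": the module is importable but no longer defines test_fn.
original = session_mod.test_fn
del session_mod.test_fn
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc
print("load failed:", error)

# Once the module defines the name again, the same payload loads fine.
session_mod.test_fn = original
restored = pickle.loads(payload)
print(restored is original)
```

In other words, the stored data is only usable in an environment where the defining module can be imported and still contains the function.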
To go into some more detail for your specific example: The main script that Python executes is called the __main__ module, and you shelved the __main__.test_fn function. What is stored then is simply a marker that signals you referenced a global, together with the import location, so something close to GLOBAL and __main__ plus test_fn is stored. When loading the shelved data again, upon seeing the GLOBAL marker, the pickle module tries to load the name test_fn from the __main__ module. Since your second script is again loaded as __main__ but doesn't have a test_fn global, loading the reference fails.
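You can look at the marker yourself with the standard pickletools module. The sketch below is for Python 3 and uses json.dumps purely as a stand-in for any importable function; for a function defined in your own script, the argument would read __main__ test_fn instead. Protocol 0 is requested explicitly because newer protocols may encode the same reference with different opcodes.

```python
import json
import pickle
import pickletools

# json.dumps is only a stand-in for "some function that lives in an importable module".
payload = pickle.dumps(json.dumps, protocol=0)

# genops yields (opcode, argument, position) for each instruction in the pickle.
opcodes = [(op.name, arg) for op, arg, pos in pickletools.genops(payload)]
for name, arg in opcodes:
    print(name, arg)
```

The first instruction is the GLOBAL marker, carrying the module name and the function name and nothing else.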