I have a list of calculations I need to run. I'm parallelizing them using:

    from pathos.multiprocessing import ProcessingPool
    pool = ProcessingPool(nodes=7)
    values = pool.map(helperFunction, someArgs)
    # The default for attr was lost in the original post; an empty tuple is assumed here.
    def __init__(self, fileName, attr=(), folder='../cache/'):
        self.folder = folder
        if len(attr) > 0:
            attr = self.attrToName(attr)
        else:
            attr = ''
        self.fileNameNaked = fileName
        self.fileName = fileName + attr

    def write(self, objects):
        with open(self.getFile(), 'wb') as output:
            # Each object is dumped as a separate pickle into the same file.
            for obj in objects:
                pickle.dump(obj, output, pickle.HIGHEST_PROTOCOL)

    >>> pickle.dump(objects, output, pickle.HIGHEST_PROTOCOL)
    Traceback (most recent call last):
      File "/usr/local/anaconda2/envs/myenv2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-1-4d2cbb7c63d1>", line 1, in <module>
        pickle.dump(objects, output, pickle.HIGHEST_PROTOCOL)
      File "/usr/local/anaconda2/envs/myenv2/lib/python2.7/pickle.py", line 1376, in dump
      File "/usr/local/anaconda2/envs/myenv2/lib/python2.7/pickle.py", line 224, in dump
      File "/usr/local/anaconda2/envs/myenv2/lib/python2.7/pickle.py", line 331, in save
      File "/usr/local/anaconda2/envs/myenv2/lib/python2.7/pickle.py", line 396, in save_reduce
      File "/usr/local/anaconda2/envs/myenv2/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/usr/local/anaconda2/envs/myenv2/lib/python2.7/site-packages/dill/dill.py", line 1203, in save_type
      File "/usr/local/anaconda2/envs/myenv2/lib/python2.7/pickle.py", line 754, in save_global
        (obj, module, name))
    PicklingError: Can't pickle <class '__main__.Parameters'>: it's not found as __main__.Parameters

Straight from the Python docs.
12.1.4. What can be pickled and unpickled?
The following types can be pickled:
- None, True, and False
- integers, floating point numbers, complex numbers
- strings, bytes, bytearrays
- tuples, lists, sets, and dictionaries containing only picklable objects
- functions defined at the top level of a module (using def, not lambda)
- built-in functions defined at the top level of a module
- classes that are defined at the top level of a module
- instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).
Everything else can't be pickled. In your case, though it's hard to say from the excerpt of your code, I believe the problem is that the class Parameters is not defined at the top level of a module, so its instances can't be pickled.
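
To make the rules above concrete, here is a minimal sketch using only the standard pickle module. The names is_picklable, TopLevel and make_nested_instance are hypothetical and exist only for illustration; they are not part of your code or of any library.

```python
import pickle


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
        return True
    except Exception:  # PicklingError, AttributeError or TypeError, depending on the case
        return False


class TopLevel(object):
    """Hypothetical class defined at module level."""

    def __init__(self, value):
        self.value = value


def make_nested_instance():
    """Build an instance of a class that is defined inside a function."""

    class Nested(object):
        pass

    return Nested()


print("dict of built-ins:", is_picklable({"a": [1, 2.0, None]}))
print("lambda:", is_picklable(lambda x: x))
print("nested class instance:", is_picklable(make_nested_instance()))
# The result for TopLevel depends on pickle finding the class again through
# its module path, which is the lookup that failed in the question's traceback.
print("top-level class instance:", is_picklable(TopLevel(1)))
```

The lambda and the nested class fail because pickle stores classes and functions by reference (module plus name) and cannot look either of them up again by name.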
The whole point of using pathos.multiprocessing (or its actively developed fork multiprocess) instead of the built-in multiprocessing is to avoid pickle, because there are far too many things the latter can't dump. Both use dill instead of pickle. And if you want to debug a worker, you can use trace.
NOTE: As Mike McKerns (the main contributor of multiprocess) rightly pointed out, there are cases that even dill can't handle, though it is hard to formulate universal rules on the matter.
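
Since there are no universal rules, one pragmatic option is to test the arguments before handing them to the pool. The sketch below is only an illustration: find_unpicklable, count_up and sample_args are hypothetical names, and the serializer defaults to the standard pickle.dumps so the snippet runs without extra packages. With pathos you would presumably pass dill.dumps as the dumps argument instead.

```python
import pickle


def find_unpicklable(args, dumps=pickle.dumps):
    """Return (index, error message) for every argument that fails to serialize."""
    failures = []
    for index, arg in enumerate(args):
        try:
            dumps(arg)
        except Exception as exc:  # serializers raise several exception types
            failures.append((index, "%s: %s" % (type(exc).__name__, exc)))
    return failures


def count_up(limit):
    """Generator function; the generator objects it returns cannot be pickled."""
    for i in range(limit):
        yield i


# Hypothetical argument list: the third element is a generator object.
sample_args = [1, "text", count_up(3), {"k": (1, 2)}]
for index, message in find_unpicklable(sample_args):
    print("argument %d cannot be serialized: %s" % (index, message))
```

Running a check like this in the parent process usually gives a shorter and clearer error than a traceback coming back from a worker.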