Cryo Cryo - 2 months ago 12x
Python Question

Python multiprocessing raises IndexError

I've developed a utility using Python/Cython that sorts CSV files and generates stats for a client, but invoking pool.map() seems to raise an exception before my mapped function has a chance to execute. Sorting a small number of files works as expected, but as the number of files grows to, say, 10, I get the IndexError below after calling pool.map(). Does anyone recognize this error? Any help is greatly appreciated.

While the code is under NDA, the use-case is fairly simple:

Code Sample:

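import multiprocessing

def sort_files(csv_files):
    # One worker process per CPU core
    pool_size = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=pool_size)
    # chunksize=1: hand each worker one file at a time
    sorted_dicts = pool.map(sort_file, csv_files, 1)
    return sorted_dicts

def sort_file(csv_file):
    print 'sorting %s...' % csv_file
    # sort code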


File "generic.pyx", line 17, in generic.sort_files (/users/cyounker/.pyxbld/temp.linux-x86_64-2.7/pyrex/generic.c:1723)
sorted_dicts = pool.map(sort_file, csv_files, 1)
File "/usr/lib64/python2.7/multiprocessing/", line 227, in map
return self.map_async(func, iterable, chunksize).get()
File "/usr/lib64/python2.7/multiprocessing/", line 528, in get
raise self._value
IndexError: list index out of range


The IndexError is raised somewhere inside sort_file(), i.e. in a worker subprocess, and is then re-raised in the parent process by pool.map(). Apparently multiprocessing makes no attempt to tell us where the error really comes from (e.g. on which line it occurred), or even which argument to sort_file() caused it. I hate multiprocessing even more :-(
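
One way to recover the missing information is to catch the exception inside the worker and re-raise it with the formatted traceback and the offending argument folded into the message. The sketch below is only an illustration. The body of sort_file() is a hypothetical stand-in (the real sort code is under NDA), and sort_file_traced() is a helper name invented here, not part of multiprocessing.

```python
import multiprocessing
import traceback


def sort_file(csv_file):
    # Hypothetical stand-in for the real sort code: it fails with an
    # IndexError when the file name has no extension.
    fields = csv_file.split('.')
    return fields[1]


def sort_file_traced(csv_file):
    """Call sort_file() and, on failure, re-raise with the worker's traceback."""
    try:
        return sort_file(csv_file)
    except Exception:
        # format_exc() is evaluated inside the worker, where the original
        # traceback still exists; the text travels to the parent as part
        # of the exception message.
        raise RuntimeError('sort_file(%r) failed in worker:\n%s'
                           % (csv_file, traceback.format_exc()))


def sort_files(csv_files):
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    try:
        # Map the wrapper instead of sort_file itself.
        return pool.map(sort_file_traced, csv_files, 1)
    finally:
        pool.close()
        pool.join()
```

With this in place, the exception re-raised by pool.map() in the parent should carry the file name and the line inside sort_file() that failed, instead of a bare "list index out of range". The wrapper has to live at module level so that it can be pickled and sent to the workers.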