Parashar Parashar - 5 months ago 30
Python Question

Why do processes take a long time to join even after the map call to the pool has completed?

This is yet another question regarding the multiprocessing module in Python 3.5. My problem is that I know all the forked processes have done their job (I can see their results in the queue), and AsyncResult.successful() returns True, which means the jobs are completed. But when I proceed with PoolObj.join(), it takes forever. I know I can call PoolObj.terminate() and carry on with my life, but I want to know why the heck this happens.

I'm using the following code:

    from multiprocessing import Pool, Queue

    def worker(d):
        # Assumed body: the answer indicates the worker writes to this extra queue.
        queue.put(d)

    def gen_data():
        for i in range(int(1e6)):
            yield i

    if __name__ == "__main__":
        queue = Queue(maxsize=-1)
        pool = Pool(processes=12)
        pool_obj_worker = pool.map_async(worker, gen_data(), chunksize=1)
        pool.close()  # no more tasks will be submitted; required before join()

        print('Lets run the workers...\n')
        while True:
            if pool_obj_worker.ready():
                if pool_obj_worker.successful():
                    print('\nAll processed successfully!')  # I can see this quickly, so my jobs are done
                else:
                    print('\nAll processed. Errors encountered!')
                print(queue.qsize())  # The size is right, which means all workers have done their job
                pool.join()  # will get stuck here for a long, long time
                break
        # The argument was cut off in the source; the queue size is assumed here.
        print('%d still to be processed' % queue.qsize())

Am I doing it wrong? Please enlighten me. Or have the processes holding the queue gone zombie?


The issue here is that you are using an extra Queue in your worker, in addition to the ones provided by Pool. When the processes finish their work, they all join the feeder thread used by multiprocessing.Queue, and these calls hang (probably because all the joins happen simultaneously and there can be some weird race conditions; it is not easy to investigate).

Adding multiprocessing.util.log_to_stderr(10) reveals that your processes hang while joining the queue feeder thread.
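
As a rough sketch of that diagnostic step, debug logging can be switched on before the pool is created. The helper name enable_mp_debug_logging is made up for illustration; multiprocessing.log_to_stderr is the documented top-level wrapper around multiprocessing.util.log_to_stderr, and level 10 corresponds to logging.DEBUG.

```python
import logging
import multiprocessing


def enable_mp_debug_logging():
    """Send multiprocessing's internal log messages to stderr at DEBUG level."""
    # logging.DEBUG == 10, the value passed in the answer above.
    logger = multiprocessing.log_to_stderr(logging.DEBUG)
    return logger


if __name__ == "__main__":
    # Call this before creating the Pool.
    log = enable_mp_debug_logging()
    log.debug("multiprocessing debug logging enabled")
```

With this enabled, the shutdown messages written to stderr should show where a worker stops making progress.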

To solve your issue, you can either use multiprocessing.SimpleQueue instead of multiprocessing.Queue (no hang in join, as there is no feeder thread) or try the method pool.imap_unordered, which provides the same kind of behavior as what you seem to be implementing (it gives back an unordered iterator over the results returned by worker).
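
A minimal sketch of the second option, assuming the worker can simply return its result instead of putting it on a separate queue. The worker body (doubling its input) and the function name collect_results are placeholders, since the original worker is not shown.

```python
from multiprocessing import Pool


def worker(d):
    # Placeholder computation: return the result rather than
    # writing it to an extra multiprocessing.Queue.
    return d * 2


def collect_results(n_items=1000, processes=4):
    """Run worker over range(n_items) and gather results in completion order."""
    pool = Pool(processes=processes)
    # imap_unordered yields each result as soon as it is ready,
    # so the order of the results is not guaranteed.
    results = list(pool.imap_unordered(worker, range(n_items), chunksize=50))
    pool.close()  # no more tasks will be submitted
    pool.join()   # no extra queue, so no feeder thread to wait on
    return results


if __name__ == "__main__":
    out = collect_results()
    print(len(out), "results collected")
```

Because the results travel back through the pool's own machinery, the workers never create a feeder thread for an extra queue, so that join is avoided.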