Dror Hilman Dror Hilman - 10 months ago 260
Python Question

How can we use tqdm in a parallel execution with joblib?

I want to run a function in parallel, and wait until all parallel nodes are done, using joblib. Like in the example:

from math import sqrt
from joblib import Parallel, delayed
Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))

But, I want that the execution will be seen in a single progressbar like with tqdm, showing how many jobs has been completed.

How would you do that?

Answer Source

If your problem consists of many parts, you could split the parts into k subgroups, run each subgroup in parallel and update the progressbar in between, resulting in k updates of the progress.

This is demonstrated in the following example from the documentation.

>>> with Parallel(n_jobs=2) as parallel:
...    accumulator = 0.
...    n_iter = 0
...    while accumulator < 1000:
...        results = parallel(delayed(sqrt)(accumulator + i ** 2)
...                           for i in range(5))
...        accumulator += sum(results)  # synchronization barrier
...        n_iter += 1