user642897 - 6 months ago
Python Question

Python multiprocessing pool.map for multiple arguments

In the Python multiprocessing library, is there a variant of pool.map which supports multiple arguments?

text = "test"
def harvester(text, case):
    X = case[0]
    text+ str(X)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=6)
    case = RAW_DATASET
    pool.map(harvester(text,case),case, 1)
    pool.close()
    pool.join()

Answer

My initial thought was to use functools.partial. As J.F. Sebastian indicated, partial works in this instance in Python >= 2.7, so I am posting it with the caveat that it won't work in 2.6.

Note that in the code above you're passing the result of calling harvester(text, case) rather than the function harvester itself. You also aren't returning anything; harvester has to return a value for pool.map to be useful.
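
To illustrate the difference between passing a function and passing the result of calling it, here is a minimal sketch. It uses the built-in map as a stand-in for pool.map, since both expect a callable as their first argument. The one-argument harvester and the items list are simplified, hypothetical versions of the code in the question.

```python
def harvester(text):
    # Return a value so that map() has something to collect.
    return text + "!"

items = ["a", "b"]  # hypothetical sample data

# Correct: pass the function object; map() calls it once per item.
results = list(map(harvester, items))

# Incorrect: harvester("a") is evaluated first, so map() receives a
# string instead of a callable and fails when it tries to call it.
try:
    list(map(harvester("a"), items))
except TypeError as exc:
    error = exc
```

Here results holds the mapped values, while the second call raises TypeError because a string is not callable.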

I'm assuming that text is the argument that should vary per call, while case gives the mapped function extra information about the whole sequence. The code below simply maps each element case[i] to case[i] + str(case[0]). That's a bit different from what you did, but I find this example clearer:

import multiprocessing
from functools import partial

def harvester(text, case):
    X = case[0]
    return text + str(X)

# Bind the constant argument; pool.map supplies text for each call.
partial_harvester = partial(harvester, case=RAW_DATASET)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=6)
    case_data = RAW_DATASET
    results = pool.map(partial_harvester, case_data, 1)
    pool.close()
    pool.join()
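
For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. The three-element RAW_DATASET, the run helper, and the pool size of 2 are assumptions made for illustration; the real dataset is not shown in the question.

```python
import multiprocessing
from functools import partial

# Hypothetical sample data standing in for the asker's RAW_DATASET.
RAW_DATASET = ["a", "b", "c"]

def harvester(text, case):
    # Append the first element of the shared sequence to each item.
    return text + str(case[0])

def run(processes=2):
    # Bind the constant argument once; pool.map supplies `text` per call.
    worker = partial(harvester, case=RAW_DATASET)
    pool = multiprocessing.Pool(processes=processes)
    try:
        return pool.map(worker, RAW_DATASET, 1)
    finally:
        pool.close()
        pool.join()

if __name__ == "__main__":
    print(run())  # ['aa', 'ba', 'ca']
```

The function passed to partial is defined at module level so that the worker processes can import it.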

J.F. Sebastian's answer is more general because it allows you to specify unique arguments for every call. But using partial is simpler when one of the arguments stays the same for all calls.
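
If every call does need its own pair of arguments, one option on Python 3.3 and later is Pool.starmap, which unpacks each tuple into positional arguments. This is only a sketch under that version assumption, and it is not necessarily the approach taken in the answer mentioned above. The pairs list and the run helper are hypothetical.

```python
import multiprocessing

def harvester(text, case):
    # Both arguments now differ from call to call.
    return text + str(case)

def run(pairs, processes=2):
    # Pool works as a context manager since Python 3.3.
    with multiprocessing.Pool(processes=processes) as pool:
        # starmap calls harvester(*pair) for every pair.
        return pool.starmap(harvester, pairs)

if __name__ == "__main__":
    pairs = [("a", 1), ("b", 2), ("c", 3)]
    print(run(pairs))  # ['a1', 'b2', 'c3']
```

On Python 2, where starmap is not available, a small module-level wrapper that takes one tuple and unpacks it into harvester achieves the same effect with plain pool.map.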