flybonzai flybonzai - 13 days ago 5
Python Question

How to collect function return values while multithreading without using globals?

So I'm trying to work out a generic solution that will collect all values from a function and append them to a list that is later accessible. This is to be used during concurrent.futures or threading type tasks. Here is a solution I have using a global master_list:

from concurrent.futures import ThreadPoolExecutor

master_list = []
def return_from_multithreaded(func):
    # master_list = []
    def wrapper(*args, **kwargs):
        # nonlocal master_list
        global master_list
        master_list += func(*args, **kwargs)
    return wrapper


@return_from_multithreaded
def f(n):
    return [n]


with ThreadPoolExecutor(max_workers=20) as exec:
    exec.map(f, range(1, 100))

print(master_list)


I would like to find a solution that does not include globals, and perhaps can return the commented out master_list that is stored as a closure?

Answer

If you don't want to use globals, don't discard the results of map. map already gives you back the value returned by each call; you just ignored those values. The code becomes much simpler if you use map for its intended purpose:

from concurrent.futures import ThreadPoolExecutor

def f(n):
    return n  # No need to wrap in list

with ThreadPoolExecutor(max_workers=20) as exec:
    master_list = list(exec.map(f, range(1, 100)))

print(master_list)

If you need a master_list that shows the results computed so far (maybe some other thread is watching it), you just make the loop explicit:

def f(n):
    return n  # No need to wrap in list

master_list = []
with ThreadPoolExecutor(max_workers=20) as exec:
    for result in exec.map(f, range(1, 100)):
        master_list.append(result)

print(master_list)

This is what the Executor model is designed for. Plain threads aren't intended to return values, but Executors provide a channel for returning values under the covers so you don't have to manage it yourself. Internally, this uses queues of some form or another, with additional bookkeeping to keep the results in order, but you don't need to deal with that complexity. From your perspective, it's equivalent to the regular map function; it just happens to parallelize the work.
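
If you'd rather pick up each result as soon as its call finishes, instead of in input order, you can work with the Future objects directly. Here is a minimal sketch using submit and as_completed, assuming completion order is acceptable for your use case. The helper name collect is hypothetical (it is not part of concurrent.futures), and f is the same placeholder function as above:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def f(n):
    return n  # placeholder workload, same as in the examples above


def collect(func, inputs, max_workers=20):
    """Hypothetical helper: run func over inputs in a thread pool and
    return the results in the order the calls complete."""
    results = []  # local list, no globals needed
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns one Future per call
        futures = [pool.submit(func, x) for x in inputs]
        # as_completed() yields each Future once it has finished
        for future in as_completed(futures):
            # result() returns func's return value, or re-raises its exception
            results.append(future.result())
    return results


if __name__ == "__main__":
    master_list = collect(f, range(1, 100))
    print(len(master_list))
```

Only the main thread appends to results here, so no lock is needed. The order of master_list can differ from run to run; sort it or stick with map if order matters.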
