x34c4 x34c4 - 3 months ago 5
Python Question

Dividing a for loop for each process

I have this code:

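import itertools
import string
def loop():
    alphabet = string.digits + string.ascii_letters  # string.letters exists only on Python 2
    for key in itertools.product(alphabet, repeat=6):
        pass  # do the work for each key here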

I am using 4 processes using this code:

if __name__ == '__main__':
    jobs = []
    for i in range(4):
        p = multiprocessing.Process(target=loop)
        jobs.append(p)
        p.start()

Now, this just runs the entire function 4 times. I need to somehow split the workload into 4 parts and have each process handle only its own part, so in this case I need to split the keys I'm generating into 4 different ranges. For example:

Process 1 workload


Process 2 workload


Process 3 workload


Process 4 workload


I think you should understand what I want to do..

I tried looping through and just throwing away the keys that belong to other processes, but that gets super slow with a large key length. If I had 1,000,000 lines and the process number was 4, it would loop 750,000 times without doing anything and then process the last 250,000. If the process number was 3, it would loop 500,000 times, process the next 250,000 and finish at 750,000. So much wasted computing power though :/


You need to divide the workload beforehand and pass each part to your function when you create the Process. Generally speaking, this can be a hard problem, but in your case it's pretty trivial since you're just generating Cartesian products: simply slice off the first character and attach it separately.

i.e. instead of generating repeat=6, use repeat=5 and iterate through the possibilities for the first letter yourself, passing each to a separate process.

For example:

def loop(first, sequence):
    for seq in sequence:
        # product() yields tuples, so wrap the first character in a tuple before joining
        key = (first,) + seq

and call it with:

from multiprocessing import Process

alphabet = ...
for letter in alphabet:
    p = Process(target=loop, args=(letter, itertools.product(alphabet, repeat=5)))
    # etc.

This will spawn one process per character in your alphabet (62 of them for digits plus letters). To get exactly four processes, or any other split, pass each process a range of first characters instead of a single one.
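For the four-way split, something along these lines might work. Treat it as a rough sketch rather than a drop-in solution: `split_alphabet`, `keys_for` and `worker` are names made up for this example, the worker only counts keys as a stand-in for your real per-key work, and `length` is set to 3 so it finishes quickly (the question uses 6). It also uses `string.ascii_letters` so it runs on Python 3, and it builds the product inside each process instead of passing an iterator as an argument, on the assumption that plain strings are the safer thing to hand to a child process.

```python
import itertools
import multiprocessing
import string


def split_alphabet(alphabet, n_parts):
    """Split alphabet into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(alphabet), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks get one additional character.
        end = start + size + (1 if i < extra else 0)
        chunks.append(alphabet[start:end])
        start = end
    return chunks


def keys_for(first_chars, alphabet, length):
    """Yield every key of the given length whose first character is in first_chars."""
    for first in first_chars:
        for rest in itertools.product(alphabet, repeat=length - 1):
            yield (first,) + rest


def worker(first_chars, alphabet, length):
    # Hypothetical worker: just counts its keys; put the real work here.
    count = 0
    for key in keys_for(first_chars, alphabet, length):
        count += 1
    print(first_chars, count)


if __name__ == '__main__':
    alphabet = string.digits + string.ascii_letters
    length = 3  # small for demonstration; the question uses 6
    jobs = []
    for chunk in split_alphabet(alphabet, 4):
        p = multiprocessing.Process(target=worker, args=(chunk, alphabet, length))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
```

With 62 characters and 4 processes the chunks hold 16, 16, 15 and 15 first characters, so no process ever has to skip over keys that belong to another one.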