Haiyang Haiyang - 26 days ago 7
Python Question

multithreading: Why aren't generators thread-safe? What happens when a generator is shared among threads?

I'm reading this question which asks if generators are thread-safe, and one answer said:


It's not thread-safe; simultaneous calls may interleave, and mess with
the local variables.


Another answer shows that you can use a lock to ensure that only one thread uses the generator at a time.
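
For reference, that lock-based approach can be sketched roughly as follows. This is only an illustration, not the code from that answer: the names LockedIterator and consume are made up here, and the sketch is written so that it should run on both Python 2 and Python 3.

```python
import threading


class LockedIterator(object):
    """Hypothetical wrapper that serializes next() calls on a shared iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying generator.
        with self._lock:
            return next(self._it)

    next = __next__  # Python 2 spelling of the iterator protocol


def generator(data):
    for item in data:
        yield item


def consume(shared, seen):
    # Each worker records the items it personally received.
    for item in shared:
        seen.append(item)


shared = LockedIterator(generator(range(100000)))
seen_a, seen_b = [], []
threads = [threading.Thread(target=consume, args=(shared, seen))
           for seen in (seen_a, seen_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every item is delivered exactly once, split between the two threads.
total = len(seen_a) + len(seen_b)
```

The lock only guarantees that the threads take turns advancing the generator; how the items end up divided between the two threads is still up to the scheduler.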

I'm new to multithreading. Can anyone devise an example to show what exactly happens when you use the generator without lock?

For example, it doesn't seem to have any problems if I do this:

import threading

def generator():
    for i in data:
        yield i

class CountThread(threading.Thread):
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name

    def run(self):
        # gen is already a generator object, so iterate over it directly
        for i in gen:
            print '{0} {1}'.format(self.name, i)

data = [i for i in xrange(100)]
gen = generator()
a = CountThread('a')
b = CountThread('b')
a.start()
b.start()

Answer

Run this example.

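You'll see that the 10,000 numbers are "shared" between the threads: each number is consumed by exactly one of them, so neither thread gets its own full pass over all 10,000.

With an input this small, the most likely outcome is that one thread consumes all the numbers before the other gets a chance to run.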

import threading

class CountThread(threading.Thread):
    def __init__(self, gen):
        threading.Thread.__init__(self)
        self.gen = gen  # the same generator object is shared by every thread
        self.numbers_seen = 0

    def run(self):
        for i in self.gen:
            self.numbers_seen += 1  # count the items this thread consumed


def generator(data):
    for item in data:
        yield item  # hand out one element at a time

gen = generator(xrange(10000))

a = CountThread(gen)
b = CountThread(gen)

a.start()
b.start()

# wait for both threads before reading their counters
a.join()
b.join()

print "Numbers seen in a", a.numbers_seen
print "Numbers seen in b", b.numbers_seen

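If Python happens to switch threads while one of them is still inside the generator (to provoke this, use a much larger value than 10000, e.g. 10000000), the other thread will try to advance a generator that is already running, and you'll get an exception: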

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "test.py", line 10, in run
    for i in self.gen:
ValueError: generator already executing
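
The error itself is not specific to threads: CPython raises it whenever a generator is advanced while its body is already running, and a badly timed thread switch is just one way to get there. A minimal single-threaded sketch of the same condition (the name self_advancing is made up for this illustration; the code should run on both Python 2 and Python 3):

```python
def self_advancing():
    # While this body is running, the generator is marked as executing,
    # so advancing it again (here: from inside itself) is rejected.
    yield next(g)


g = self_advancing()

try:
    next(g)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)  # generator already executing
```

In the threaded example the second next() call comes from the other thread instead of from the generator itself, but the check that fails is the same one.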