sreeraag - 5 days ago
Python Question

Is the self object shared between processes started with Python's multiprocessing.Process?

I have an __init__ method that initializes several attributes, both primitive values and more complex objects. In each process spawned by multiprocessing.Process, I print one of those attributes and the address of one of the initialized objects.
Each process gets its own instance of the variable, but the address of the object stays the same. What exactly happens to the members of the parent object when multiprocessing.Process is called?

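import multiprocessing
import os
import time

import pymongo


class Worker(object):  # class name is not given in the question; "Worker" is a placeholder
    def __init__(self):
        self.count = 0
        self.db = pymongo.MongoClient()

    def consumerManager(self):
        for i in range(4):
            p = multiprocessing.Process(target=self.consumer, args=(i,))
            p.start()  # without start() the child process never runs

    def consumer(self, i):
        while True:
            time.sleep(i)
            self.count += 1         # changes this process's copy only
            print(self.count)
            print(os.getpid())
            print(id(self.db))      # address of self.db inside this process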


If it were making a deep copy of the objects, then id(self.db) should print a different id in each process, which doesn't happen. How is this done?

Answer

Generally, on Linux, when a new process is created (with fork), it starts out as a copy of its parent.

At the beginning the two processes will be in the same state but with different address spaces.

To save time, Linux shares the parent's memory pages with the child until one of the two processes writes to them; only then is the affected page actually copied. This is usually referred to as copy-on-write.
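A minimal sketch of what this looks like from Python 3, assuming a Unix system where the "fork" start method is available. The names state, child_task and run_fork_demo are made up for illustration. The child increments a value it inherited from the parent, and the parent never sees the change because the write only touches the child's private copy.

```python
import multiprocessing

# Module-level state that the child inherits when it is forked.
state = {"count": 0}


def child_task(conn):
    # Runs in the child: this write affects only the child's copy of state.
    state["count"] += 1
    conn.send(state["count"])
    conn.close()


def run_fork_demo():
    # Assumption: "fork" is requested explicitly; it exists only on Unix.
    ctx = multiprocessing.get_context("fork")
    recv_end, send_end = ctx.Pipe(duplex=False)
    child = ctx.Process(target=child_task, args=(send_end,))
    child.start()
    seen_in_child = recv_end.recv()
    child.join()
    # The parent's copy was never modified.
    return seen_in_child, state["count"]


if __name__ == "__main__":
    child_value, parent_value = run_fork_demo()
    print("child saw", child_value, "parent still has", parent_value)
    # prints: child saw 1 parent still has 0
```

This is the same effect you observe with self.count: every process increments its own copy.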

As the two processes keep executing, their state will diverge. If you want them to share information you can use different mechanisms: Pipes, Shared memory, Managers and Queues.

Usually, because of their simplicity, Pipes and Queues are the recommended choices.
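As a rough illustration of the Queue approach, here is a sketch in Python 3, assuming the "fork" start method on Unix. The functions consumer and run_consumers are hypothetical and only loosely follow the code in the question. Instead of incrementing self.count in each process, every worker sends a message back and the parent does the counting.

```python
import multiprocessing
import os


def consumer(i, results):
    # Hypothetical worker: report back to the parent instead of
    # mutating state that the parent will never see.
    results.put((i, os.getpid()))


def run_consumers(n=4):
    # Assumption: "fork" is requested explicitly; it exists only on Unix.
    ctx = multiprocessing.get_context("fork")
    results = ctx.Queue()
    workers = [ctx.Process(target=consumer, args=(i, results))
               for i in range(n)]
    for worker in workers:
        worker.start()
    # Read one message per worker before joining, so no child is left
    # blocked while flushing its data to the queue.
    messages = [results.get() for _ in workers]
    for worker in workers:
        worker.join()
    # The count lives in the parent only.
    return len(messages), messages


if __name__ == "__main__":
    count, messages = run_consumers()
    print("messages received:", count)
    for index, pid in sorted(messages):
        print("worker", index, "ran in pid", pid)
```

The order in which messages arrive is not guaranteed, which is why the usage example sorts them before printing.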

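The reason you see the same id is explained in the following question. In CPython, id() returns the object's address in the process's virtual memory. As the new process starts with the same memory layout as the parent, the object sits at the same virtual address in every process, so the id is the same even though each process has its own copy of the object.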
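You can check this yourself with a small sketch (Python 3, assuming CPython and the "fork" start method on Unix). Holder and compare_ids are names invented for this example, and a plain object() stands in for pymongo.MongoClient() so that no database is needed.

```python
import multiprocessing
import os


class Holder:
    def __init__(self):
        # Stand-in for pymongo.MongoClient(); any object will do here.
        self.db = object()

    def report(self, conn):
        # Runs in the child: send back the child's pid and the id it sees.
        conn.send((os.getpid(), id(self.db)))
        conn.close()


def compare_ids():
    # Assumption: "fork" is requested explicitly; it exists only on Unix.
    ctx = multiprocessing.get_context("fork")
    holder = Holder()
    recv_end, send_end = ctx.Pipe(duplex=False)
    child = ctx.Process(target=holder.report, args=(send_end,))
    child.start()
    child_pid, child_id = recv_end.recv()
    child.join()
    return os.getpid(), id(holder.db), child_pid, child_id


if __name__ == "__main__":
    parent_pid, parent_id, child_pid, child_id = compare_ids()
    print("parent", parent_pid, "id", parent_id)
    print("child ", child_pid, "id", child_id)
    print("same id:", parent_id == child_id)
    # prints: same id: True
```

The two pids differ while the ids match, so an equal id across processes says nothing about whether the object is shared. With a start method other than fork, the child rebuilds the object from a pickled copy and the ids would usually differ.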
