Adam S. - 1 year ago
Python Question

Possible causes of CUDA get device properties error with Python3 / Theano?

I'm trying to use multiple GPUs with multiprocessing in Python3. I can run a simple test case, like the following:

import theano
import theano.tensor as T
import multiprocessing as mp
import time
# import lasagne

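def target():
    # Import the CUDA backend only inside the child, after the fork.
    import theano.sandbox.cuda
    print("target about to use")
    theano.sandbox.cuda.use("gpu1")  # child claims device 1
    print("target is using")
    import lasagne
    print("target is exiting")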

x = T.scalar('x', dtype='float32')

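p = mp.Process(target=target)
p.start()

# The master also imports the CUDA backend only after the fork.
import theano.sandbox.cuda
print("master about to use")
theano.sandbox.cuda.use("gpu0")  # master claims device 0
print("master is using")
import lasagne
print("master will join")
p.join()
print("master is exiting")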

When I run this, I get the master and the spawned process each using a GPU successfully:

>> target about to use
>> master about to use
>> Using gpu device 1: GeForce GTX 1080 (CNMeM is enabled with initial size: 50.0% of memory, cuDNN 5105)
>> target is using
>> Using gpu device 0: GeForce GTX 1080 (CNMeM is enabled with initial size: 50.0% of memory, cuDNN 5105)
>> master is using
>> master will join
>> target is exiting
>> master is exiting

But in a more complex code-base, when I try to set up the same scheme, the spawned worker fails with:

ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device 1 failed:
Unable to get properties of gpu 1: initialization error
ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device gpu failed:
Not able to select available GPU from 2 cards (initialization error).

And I'm having a hard time chasing down what's causing this. In the code snippet above, the problem is recreated if lasagne is imported at the top, before forking. But I've managed to prevent my code from importing lasagne until after forking and trying to use a GPU (I checked the loaded modules), and still the problem persists. I don't see anything Theano related except for theano itself and theano.tensor being imported before forking, but in the example above that's fine.

Has anyone else chased down anything similar?

Answer

OK, this turned out to be very simple: I had a stray import theano.sandbox.cuda in a pre-fork location, and that import must happen only after forking. It was still necessary to also move the lasagne imports to after the fork, in case that helps anyone else.
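One way to catch a stray import like this is to inspect sys.modules just before forking. The following is only a minimal sketch: the helper names and the prefix list are hypothetical (not part of Theano or lasagne), and it assumes that any module under theano.sandbox.cuda or lasagne counts as unsafe to load pre-fork.

```python
import sys

# Hypothetical list: module prefixes that should not be loaded before forking.
GPU_SENSITIVE_PREFIXES = ("theano.sandbox.cuda", "lasagne")


def loaded_before_fork(prefixes=GPU_SENSITIVE_PREFIXES, modules=None):
    """Return the sorted names of loaded modules that match any prefix.

    `modules` defaults to sys.modules; it can be injected for testing.
    """
    if modules is None:
        modules = sys.modules
    hits = []
    for name in list(modules):
        for prefix in prefixes:
            # Match the package itself or any of its submodules.
            if name == prefix or name.startswith(prefix + "."):
                hits.append(name)
                break
    return sorted(hits)


def check_safe_to_fork():
    """Raise if a GPU-sensitive module is already loaded in this process."""
    hits = loaded_before_fork()
    if hits:
        raise RuntimeError("imported before fork: " + ", ".join(hits))
```

Calling check_safe_to_fork() right before p.start() would then fail loudly and name the offending modules, rather than surfacing later as an initialization error in the worker.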

(In my case, I actually need information from lasagne-based code before the fork, so I have to spawn a throw-away process which loads that code and passes the relevant values back to the master process. The master can then build shared objects accordingly, fork, and subsequently each process builds its own lasagne-based objects, which work on its own GPU.)
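The throw-away process idea can be sketched roughly as below. This is an illustration under assumptions, not my actual code: load_model_info is a hypothetical stand-in for whatever lasagne-based code produces the values, the returned dictionary is made up, and the "fork" start method is assumed to be available on the platform.

```python
import multiprocessing as mp


def load_model_info():
    """Hypothetical stand-in for the lasagne-based code.

    In the real case, `import lasagne` would happen here, so the import
    only ever occurs inside the short-lived child process.
    """
    return {"param_shapes": [(784, 256), (256, 10)]}


def _probe(conn):
    # Runs in the throw-away child: compute the values, send them, exit.
    try:
        conn.send(load_model_info())
    finally:
        conn.close()


def query_in_throwaway_process():
    """Get the values from a child so the master never imports lasagne."""
    ctx = mp.get_context("fork")  # assumption: fork is available
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_probe, args=(child_conn,))
    proc.start()
    child_conn.close()  # the master keeps only the receiving end
    info = parent_conn.recv()
    proc.join()
    return info
```

The master would call query_in_throwaway_process() first, build its shared objects from the returned values, and only then fork the real workers, each of which imports theano.sandbox.cuda and lasagne on its own.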