Channel72 - 27 days ago
Python Question

PyEval_InitThreads in Python 3: How/when to call it? (the saga continues ad nauseam)

So, basically there seems to be massive confusion/ambiguity over when exactly PyEval_InitThreads() is supposed to be called, and what accompanying API calls are needed. The official Python documentation is unfortunately very ambiguous. There are already many questions on Stack Overflow regarding this topic, and indeed, I've personally already asked a question almost identical to this one, so I won't be particularly surprised if this is closed as a duplicate; but consider that there seems to be no definitive answer to this question. (Sadly, I don't have Guido van Rossum on speed-dial.)

Firstly, let's define the scope of the question here: what do I want to do? Well... I want to write a Python extension module in C that will:


  1. Spawn worker threads using the pthread API in C

  2. Invoke Python callbacks from within these C threads



Okay, so let's start with the Python docs themselves. The Python 3.2 docs say:


void PyEval_InitThreads()

Initialize and acquire the global interpreter lock. It should be
called in the main thread before creating a second thread or engaging
in any other thread operations such as PyEval_ReleaseThread(tstate).
It is not needed before calling PyEval_SaveThread() or
PyEval_RestoreThread().


So my understanding here is that:


  1. Any C extension module which spawns threads must call PyEval_InitThreads() from the main thread before any other threads are spawned

  2. Calling PyEval_InitThreads locks the GIL



So common sense would tell us that any C extension module which creates threads must call PyEval_InitThreads(), and then release the Global Interpreter Lock. Okay, seems straightforward enough. So prima facie, all that's required would be the following code:

PyEval_InitThreads(); /* initialize threading and acquire GIL */
PyEval_ReleaseLock(); /* Release GIL */


Seems easy enough... but unfortunately, the Python 3.2 docs also say that PyEval_ReleaseLock has been deprecated. Instead, we're supposed to use PyEval_SaveThread in order to release the GIL:


PyThreadState* PyEval_SaveThread()

Release the global interpreter lock (if it has been created and thread
support is enabled) and reset the thread state to NULL, returning the
previous thread state (which is not NULL). If the lock has been
created, the current thread must have acquired it.


Er... okay, so I guess a C extension module needs to say:

PyEval_InitThreads();
PyThreadState* st = PyEval_SaveThread();




Indeed, this is exactly what this Stack Overflow answer says. Except when I actually try this in practice, the Python interpreter immediately seg-faults when I import the extension module. Nice.

Okay, so now I'm giving up on the official Python documentation and turning to Google. So, this random blog claims all you need to do from an extension module is to call PyEval_InitThreads(). Of course, the documentation claims that PyEval_InitThreads() acquires the GIL, and indeed, a quick inspection of the source code for PyEval_InitThreads() in ceval.c reveals that it does indeed call the internal function take_gil(PyThreadState_GET()).


So PyEval_InitThreads() definitely acquires the GIL. I would think then that you would absolutely need to somehow release the GIL after calling PyEval_InitThreads(). But how? PyEval_ReleaseLock() is deprecated, and PyEval_SaveThread() just inexplicably seg-faults.

Okay... so maybe for some reason which is currently beyond my understanding, a C extension module doesn't need to release the GIL. I tried that... and, as expected, as soon as another thread attempts to acquire the GIL (using PyGILState_Ensure), the program deadlocks. So yeah... you really do need to release the GIL after calling PyEval_InitThreads().

So again, the question is: how do you release the GIL after calling PyEval_InitThreads()?


And more generally: what exactly does a C-extension module have to do to be able to safely invoke Python code from worker C-threads?

Answer

The short answer: you shouldn't care about releasing the GIL after calling PyEval_InitThreads; it will never again be released, except temporarily.

In Python threading, someone always holds the GIL. A thread can relinquish it temporarily with PyEval_SaveThread, but it will get it back at PyEval_RestoreThread (the same applies to the Py_{BEGIN,END}_ALLOW_THREADS macros). Since someone always must hold the GIL, it only makes sense for the call that materializes the lock into the system to also acquire it.

Therefore your C extension should simply call PyEval_InitThreads in its init function. The current thread will have the GIL until it relinquishes it to someone else; this is how Python threading works. When your worker C threads need to invoke Python (or call anything from the Python API, including Py_INCREF), just wrap the code in the usual pair of PyGILState_Ensure and PyGILState_Release. Since PyGILState_Ensure calls can be nested, as long as each one is matched by a PyGILState_Release, this will work even if the current thread is already holding the GIL.