niosus - 6 months ago
Python Question

Sublime Text 3 plugin host crash recovery

I am developing a plugin for Sublime Text 3, and my Python code uses ctypes bindings to libclang. Sometimes a call into libclang segfaults with

libclang: crash detected during reparsing
(I don't understand the reason yet, but it is irrelevant to this question). This then crashes the plugin host.

So the question is: is there any way in Python to recover from a failure in the underlying C binding? I would gladly just skip the action for the particular file where the crash occurs.

Thanks!

UPD: There was a short discussion in the comments, so it makes sense to elaborate on the lack of a small reproducible example. It is not because of laziness; I do try to make the issue as easy as possible to understand for the people I expect help from. But in this case it is really hard. The original issue is caused by libclang segfaulting in some strange situation that I haven't nailed down yet. It probably has something to do with one library being compiled without C++11 support while another one that uses it is compiled with C++11 support, but I want to emphasize that this is irrelevant to the question. The issue here is that there is a segfault in something that Python is calling, and this segfault causes the Sublime Text plugin_host to exit. So there is no simple example here, but not for lack of trying. I am also open to suggestions if you have ideas on how to construct one. Sorry for the poor quality of this question; this is currently my best.

Answer

Working with the detail that I have, I'm reasonably sure your question boils down to "can Python handle errors that occurred when using the foreign function interface."

I'm pretty sure that the answer is "no", and I put together the following test scenario to explain why:

Here's our test C++ module (with an extern "C" wrapper so the exported function names are not mangled) that will blow up in our face, test.cc:

#include <iostream>
#include <signal.h>

class Test{
    public:
        void test(){
            std::cout << "stackoverflow" << std::endl;
            // this will crash us. shouldn't really matter what SIG as long as it crashes Python
            raise (SIGABRT);
        }
};


extern "C" {
    Test* Test_new(){ return new Test(); }
    void Test_example(Test* test){ test->test(); }
}

clang -shared -undefined dynamic_lookup -o test.so test.cc

And our calling script, test.py:

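from ctypes import cdll, c_void_p

test_so = cdll.LoadLibrary("test.so")

# Test_new returns a pointer. Without an explicit restype, ctypes treats
# the result as a C int and may truncate it on 64-bit platforms.
test_so.Test_new.restype = c_void_p
test_so.Test_example.argtypes = [c_void_p]
test_so.Test_example.restype = None

class PyTest:
    def __init__(self):
        # Opaque pointer to the C++ Test instance.
        self.obj = test_so.Test_new()

    def output(self):
        # Calls Test::test(), which raises SIGABRT.
        test_so.Test_example(self.obj)

if __name__ == "__main__":
    p = PyTest()
    p.output()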

Call it:

Ξ /tmp/29_may → python test.py
stackoverflow
[1]    55992 abort      python test.py

This crashes Python as expected and produces a detailed crash report on OS X:

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib          0x00007fff95bf48ea __kill + 10
1   test.so                         0x0000000110285006 Test::test() + 70
2   test.so                         0x0000000110284fb5 Test_example + 21
3   _ctypes.so                      0x000000011026d7c7 ffi_call_unix64 + 79
4   _ctypes.so                      0x000000011026dfe6 ffi_call + 818
5   _ctypes.so                      0x000000011026970b _ctypes_callproc + 867
6   _ctypes.so                      0x0000000110263b91 PyCFuncPtr_call + 1100
7   org.python.python               0x000000010fd18ad7 PyObject_Call + 99
8   org.python.python               0x000000010fd94e7f PyEval_EvalFrameEx + 11417
9   org.python.python               0x000000010fd986d1 fast_function + 262
10  org.python.python               0x000000010fd95553 PyEval_EvalFrameEx + 13165
11  org.python.python               0x000000010fd91fb4 PyEval_EvalCodeEx + 1387
12  org.python.python               0x000000010fd91a43 PyEval_EvalCode + 54
13  org.python.python               0x000000010fdb1816 run_mod + 53
14  org.python.python               0x000000010fdb18b9 PyRun_FileExFlags + 133
15  org.python.python               0x000000010fdb13f9 PyRun_SimpleFileExFlags + 711
16  org.python.python               0x000000010fdc2e09 Py_Main + 3057
17  libdyld.dylib                   0x00007fff926d15ad start + 1

I copied and pasted this because it's cleaner and easier to read than an strace (also, I'm lazy ;). The call to __kill is where we crashed; control never returns to Python, which means the crash is out of our hands.

To prove this, modify test.py into test_handle_exception.py, which tries to catch the exception:

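from ctypes import cdll, c_void_p

test_so = cdll.LoadLibrary("test.so")

# Test_new returns a pointer. Without an explicit restype, ctypes treats
# the result as a C int and may truncate it on 64-bit platforms.
test_so.Test_new.restype = c_void_p
test_so.Test_example.argtypes = [c_void_p]
test_so.Test_example.restype = None

class PyTest:
    def __init__(self):
        # Opaque pointer to the C++ Test instance.
        self.obj = test_so.Test_new()

    def output(self):
        # Calls Test::test(), which raises SIGABRT.
        test_so.Test_example(self.obj)

if __name__ == "__main__":
    p = PyTest()

    # A bare except catches every Python exception, but not a signal
    # that kills the process.
    try:
        p.output()
    except:
        print("If you're reading this, we survived somehow.")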

And running it again:

Ξ /tmp/29_may → python test_handle_exception.py
stackoverflow
[1]    56297 abort      python test_handle_exception.py

Unfortunately, as far as I know, we cannot catch the crash at the Python layer because it happens "beneath" the bytecode interpreter: SIGABRT is a signal delivered by the operating system, not a Python exception. A bare except clause catches any Python exception raised inside the try block and then runs the statement that follows it. Here, "If you're reading this, we survived somehow." was never written to stdout and the process died, which means Python never got a chance to react.
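The same behaviour can be reproduced without compiling test.so. The sketch below is my own illustration, not part of the original test: it launches a second interpreter that calls os.abort() (a stand-in for the raise(SIGABRT) in test.cc) inside a try/except, with the standard faulthandler module enabled. faulthandler does not let you recover, but it should at least dump a Python traceback to stderr before the process dies, which can help locate the offending call. The names CHILD_CODE and run_child are hypothetical.

```python
import subprocess
import sys

# Child program: enable faulthandler, then abort inside a try/except,
# mimicking the raise(SIGABRT) in test.cc without needing test.so.
CHILD_CODE = """
import faulthandler, os
faulthandler.enable()
try:
    os.abort()
except:
    print("If you're reading this, we survived somehow.")
"""


def run_child():
    # Run the child in a separate interpreter so its crash cannot take
    # this process down, and capture whatever it writes before dying.
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    returncode, out, err = run_child()
    print("return code:", returncode)  # negative on POSIX: killed by a signal
    print("stdout:", repr(out))        # the "survived" message never appears
    print("stderr:")
    print(err)                         # traceback dumped by faulthandler
```

The except branch never runs in the child, but the parent keeps running and can inspect the return code.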

If you can, handle this exception in your C++ code. You may be able to get creative and use multiprocessing to fork into a process that can crash without taking down your main process, but I doubt it.
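For what it's worth, the multiprocessing idea can be tried outside Sublime Text with a few lines of plain Python. This is a minimal sketch under stated assumptions: a POSIX system where the "fork" start method is available, Python 3.4 or newer for multiprocessing.get_context, and os.abort() standing in for the crashing libclang call. The names risky_call, safe_call and run_isolated are hypothetical, and I have not checked how this behaves inside the plugin host.

```python
import multiprocessing
import os


def risky_call():
    # Hypothetical stand-in for the crashing libclang call: abort the
    # current process, as raise(SIGABRT) does in test.cc.
    os.abort()


def safe_call():
    # A call that returns normally, for comparison.
    pass


def run_isolated(func):
    """Run func in a child process and return the child's exit code.

    0 means func returned normally; a negative value -N means the child
    was killed by signal N.
    """
    # Assumption: POSIX, where the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    proc = ctx.Process(target=func)
    proc.start()
    proc.join()
    return proc.exitcode


if __name__ == "__main__":
    for func in (safe_call, risky_call):
        code = run_isolated(func)
        if code == 0:
            print(func.__name__, "finished normally")
        else:
            # The parent is still alive and can skip this file.
            print(func.__name__, "failed with exit code", code, "- skipping")
```

The sketch only reports whether the call survived. Getting a result back from the child would need something like a multiprocessing.Queue or Pipe, and each call pays the cost of starting a process.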
