BurntSushi5 BurntSushi5 - 1 month ago 15
C Question

libsigsegv and responding to a stack overflow

We are attempting to test student code, and in an effort to automate the process, we'd like to detect if a student's code overflows the stack.

I've met with some success using the libsigsegv library and its corresponding stackoverflow_install_handler. It works brilliantly, until the student's code blows the stack twice.

For example, here's some sample output:

[# ~]$ ledit ./interpreter
-> (use solution)
-> (fun 1 2)

*** Stack overflow detected ***
-> (fun 1 2)
Signal -10
[# ~]


The initial "* Stack overflow detected *" is the desirable output. After blowing the stack for the second time, all I get is an unhelpful "Signal -10" and the program stops execution. I'd like to see the stack overflow detected message again, and let the code continue execution.

In my stack overflow handler, I'm just printing the overflow detection message to stderr and long jumping back to an "awaiting input state" in the interpreter.

Thanks for any help!

EDIT

As per caf's suggestion below, we've added a call to sigsegv_leave_handler() like so:

static void continuation(void *arg1, void *arg2, void *arg3) {
(void)(arg1);
(void)(arg2);
(void)(arg3);
siglongjmp(errorjmp, 1);
}

static void handler(int emergency, stackoverflow_context_t context) {
(void)emergency;
(void)context;
fprintf(stderr, "\n*** Stack overflow detected ***\n");
fflush(stderr);
sigsegv_leave_handler(continuation, NULL, NULL, NULL);
}


However, the output is still the same.

Answer

Simply longjmping away from a stack overflow isn't necessarily enough. I haven't seen the source code for the interpreter you're embedding this into, but my hunch is that the stack overflow leaves some internal interpreter state corrupted that may result in another crash. In particular, note that the signal you're getting is SIGBUS (10), not SIGSEGV (11).

Imagine the following scenario: You're just short of a stack overflow when the interpreter calls malloc. Malloc alters some internal data, then calls a helper function. A stack overflow occurs, and you longjmp back to the interpreter main loop. Your malloc pool is now corrupted, and there's nothing you can do about it.

I would recommend terminating and restarting the interpreter when the stack overflow is detected. Alternately, figure out exactly how interpreter state is getting corrupted, and arrange for it to be less of a problem (this can be quite hard!). You could also use explicit stack depth checking in the interpreter rather than trapping SIGSEGV; this would allow you to handle the error at a safe point, before SIGSEGV forces the issue.