Manohar Manohar - 5 months ago 136
Linux Question

handling SIGCHLD

On a system running Linux 2.6.35+, my program creates many child processes and monitors them. If a child process dies, I do some clean-up and spawn the process again. I use signalfd() to receive SIGCHLD in my process, and the signalfd descriptor is handled asynchronously with libevent.

When using signal handlers for non-real-time signals, further occurrences of a signal have to be blocked while its handler is running, to avoid re-entering the handler. If the same signal arrives several times during that window, the kernel invokes the handler only once (when the signal is unblocked).

Does signalfd() behave the same way? Since signalfd-based handling doesn't have the typical problems associated with the asynchronous execution of normal signal handlers, I was thinking the kernel could queue all further occurrences of SIGCHLD. Can anyone clarify the Linux behavior in this case?

Answer

On Linux, if multiple children terminate before you read SIGCHLD from the signalfd() descriptor, the notifications are compressed into a single SIGCHLD, because standard (non-real-time) signals are not queued while one is already pending. This means that when you read the SIGCHLD signal, you have to clean up after all children that have terminated:

// Do this after you've read() a SIGCHLD from the signalfd file descriptor:
while (1) {
    int status;
    pid_t pid = waitpid(-1, &status, WNOHANG);
    if (pid <= 0) {
        break;
    }
    // something happened with child 'pid', do something about it...
    // Details are in 'status', see waitpid() manpage
}

I should note that I have in fact seen this signal compression when two child processes terminated at the same time. With only a single waitpid() call, one of the terminated children was not handled; the loop above fixed it.

Corresponding documentation:
