How is context switch made in linux kernel when process exits before timer interrupt?
I know that if the process is running and timer interrupt occurs then
It's important to understand that the timer interrupt is just one of several hundred different reasons why
schedule might get called. Only programs whose runtime is dominated by computation, which is rarer than you'd think, ever exhaust their time slice. It's much more common for programs to run for only a few microseconds — yes, microseconds — at a time, in between "blocking" in system calls, waiting for user input or whatever.
When a process exits in any way, ultimately a call to
do_exit always happens, in the (kernel) context of that process.
schedule as its last action, and
schedule never returns to that context. Note how, at the very end of
do_exit, there is a call to
schedule, followed immediately by
BUG(); and an infinite loop.
Just prior to this,
exit_notify, which is responsible for sending
SIGCHLD to the parent process and/or releasing it from a call to
wait. So, a lot of the time, the parent process will be ready-to-run when
schedule gets called, and will be selected.
do_exit also deallocates all of the user-space state and much of the kernel state associated with the process, frees memory, closes file descriptors, etc. The
task_struct itself must survive until someone calls
wait, and I can't figure out exactly how the kernel decides that it can now be deallocated; this code is too convoluted.
If the process called
_exit, the kernel call chain is simply
do_exit. If it took a fatal synchronous signal (e.g.
SIGSEGV), the call chain is a lot longer and has a tricky diversion in it. The hardware trap is fielded by architecture-specific code (e.g. x86
complete_signal, which adjusts the task state and then tells the scheduler to wake up the offending process. The offending process wakes up, and a maze of architecture-specific signal handling logic eventually delivers it to
get_signal, which calls
do_group_exit, which calls
do_exit. Fatal asynchronous signals (e.g. from typing
kill 12345 at a shell prompt) start at
sys_kill and go through
send_signal, after which everything proceeds as above. In both cases, all of the steps up to
complete_signal may happen in any process context, but everything after "The offending process wakes up" happens in that process's context.
The only parts of this description that are Linux-specific are the names of functions in the kernel's code. Any implementation of Unix will have kernel functions that do more or less what Linux's
schedule do, and the sequences of operations involved in fielding
_exit, fatal synchronous signals, and fatal async signals will be recognizably similar.