Mark Mark - 1 month ago 14
C Question

Operating system kernel and processes in main memory

Continuing my endeavors in OS development research, I have constructed an almost complete picture in my head. One thing still eludes me.

Here is the basic boot process, from my understanding:

1) BIOS/Bootloader perform necessary checks, initialize everything.

2) The kernel is loaded into RAM.

3) Kernel performs its initializations and starts scheduling tasks.

4) When a task is loaded, it is given a virtual address space in which it resides, including the .text, .data and .bss sections, the heap and the stack. The task "maintains" its own stack pointer, pointing to its own "virtual" stack.

5) Context switches merely push the register file (all CPU registers), the stack pointer and program counter into some kernel data structure and load another set belonging to another process.

In this abstraction, the kernel is a "mother" process inside of which all other processes are hosted. I tried to convey my best understanding in the following diagram:

[diagram not available]

Question is, first is this simple model correct?

Second, how is the executable program made aware of its virtual stack? Is it the OS job to calculate the virtual stack pointer and place it in the relevant CPU register? Is the rest of the stack bookkeeping done by CPU pop and push commands?

Does the kernel itself have its own main stack and heap?

Thanks.

Answer

Question is, first is this simple model correct?

Your model is extremely simplified but essentially correct - note that the last two parts of your model aren't really considered to be part of the boot process, and the kernel isn't a process. It can be useful to visualize it as one, but it doesn't fit the definition of a process and it doesn't behave like one.

Second, how is the executable program made aware of its virtual stack? Is it the OS job to calculate the virtual stack pointer and place it in the relevant CPU register? Is the rest of the stack bookkeeping done by CPU pop and push commands?

An executable C program doesn't have to be "aware of its virtual stack." When a C program is compiled into an executable, local variables are usually referenced relative to the stack pointer or the frame pointer - for example, [ebp - 4] addresses a local four bytes below the frame pointer on 32-bit x86.
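To see that locals are just offsets within the current stack frame, here is a minimal sketch in C. The function names (frame_distance, inner) are made up for illustration. It measures how far apart a local in a caller and a local in its callee are; the program never needs to know the absolute stack address, only where the stack pointer currently is.

```c
#include <stdint.h>

/* Callee: takes the address of a caller's local (as an integer) and
 * reports the distance to one of its own locals. */
void inner(uintptr_t outer_addr, uintptr_t *distance_out)
{
    int inner_local = 0;   /* lives in inner's stack frame */
    uintptr_t inner_addr = (uintptr_t)&inner_local;

    /* Absolute difference, so the result does not depend on the
     * direction in which the stack grows. */
    *distance_out = outer_addr > inner_addr ? outer_addr - inner_addr
                                            : inner_addr - outer_addr;
}

/* Caller: returns the number of bytes between its own local and the
 * callee's local. Both objects are alive at the same time, so the
 * result is never zero. */
uintptr_t frame_distance(void)
{
    int outer_local = 0;   /* lives in frame_distance's stack frame */
    uintptr_t distance = 0;

    inner((uintptr_t)&outer_local, &distance);
    return distance;
}
```

Printing frame_distance() from main usually shows a small number of bytes, though the exact value depends on the compiler, the optimization level and whether the call is inlined.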

When Linux loads a new program for execution, it uses the start_thread macro (which is called from load_elf_binary) to initialize the CPU's registers. The macro contains the following line:

regs->esp = new_esp;   

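which sets the saved user-mode stack pointer to the virtual address that the OS has assigned to the thread's stack. That value is loaded into the CPU's stack pointer register when the kernel returns to User Mode to start running the program.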
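As a rough user-space model of that step (not the kernel's actual code), the sketch below uses a hypothetical struct toy_regs in place of the kernel's saved register set and a hypothetical toy_start_thread function in place of the macro. The eip field and the new_eip parameter are assumptions added for illustration; only the esp assignment is taken from the line quoted above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the saved user-mode registers.
 * Only the two fields needed for this illustration are modeled. */
struct toy_regs {
    uintptr_t eip;   /* saved instruction pointer (assumed for the sketch) */
    uintptr_t esp;   /* saved stack pointer */
};

/* Toy version of "prepare a freshly loaded program to run":
 * record where execution should begin and where its stack starts.
 * Nothing is written to the real CPU here; the saved values would be
 * restored into the registers on the return to user mode. */
void toy_start_thread(struct toy_regs *regs, uintptr_t new_eip,
                      uintptr_t new_esp)
{
    regs->eip = new_eip;
    regs->esp = new_esp;
}
```

From then on the program only ever moves the stack pointer with ordinary instructions; it never has to ask where its stack is.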

As you said, once the stack pointer is loaded, instructions such as pop and push will change its value. The operating system is responsible for making sure that there are physical pages that correspond to the virtual stack addresses - in programs that use a lot of stack memory, the number of physical pages will grow as the program continues its execution. There is a limit for each process that you can find by using the ulimit -a command (on my machine the maximum stack size is 8MB, or 2048 pages of 4KB each).
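The same limit can be read from inside a C program. The sketch below assumes a POSIX system and uses getrlimit with RLIMIT_STACK and sysconf with _SC_PAGESIZE; the helper names pages_for and print_stack_limit are made up for this example.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

/* Number of pages needed to back `bytes` of stack, rounding up. */
unsigned long pages_for(unsigned long bytes, unsigned long page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* Prints the soft stack limit of the calling process and the number of
 * pages it corresponds to. Returns 0 on success, -1 on failure. */
int print_stack_limit(void)
{
    struct rlimit rl;
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    if (rl.rlim_cur == RLIM_INFINITY) {
        printf("stack limit: unlimited\n");
        return 0;
    }

    printf("stack limit: %llu bytes, at most %lu pages of %ld bytes\n",
           (unsigned long long)rl.rlim_cur,
           pages_for((unsigned long)rl.rlim_cur, (unsigned long)page),
           page);
    return 0;
}
```

The limit is only a ceiling: the kernel backs stack addresses with physical pages as they are first touched, so a program that uses little stack occupies far fewer pages than the limit allows.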

Does the kernel itself have its own main stack and heap?

This is where visualizing the kernel as a process can become confusing. First of all, threads in Linux have a user stack and a kernel stack. They're essentially the same, differing only in protections and location (kernel stack is used when executing in Kernel Mode, and user stack when executing in User Mode).

The kernel itself does not have its own stack. Kernel code is always executed in the context of some thread, and each thread has its own fixed-size kernel stack (usually 8KB on 32-bit x86). When a thread moves from User Mode to Kernel Mode, the CPU's stack pointer is updated accordingly. So when kernel code uses local variables, they are stored on the kernel stack of the thread in which it is executing.
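A toy model in plain C may make the stack switch easier to picture. Everything here is hypothetical (struct toy_thread, struct toy_cpu and the toy_* functions are invented for the sketch), and a real kernel relies on hardware support for the switch rather than on code like this. The only idea it is meant to show is that the stack pointer leaves the user stack on entry to the kernel and comes back on exit.

```c
#include <stdint.h>

#define TOY_KSTACK_SIZE 8192   /* illustrative fixed size for the sketch */

/* One thread: a saved user stack pointer plus a private kernel stack. */
struct toy_thread {
    uintptr_t user_sp;
    unsigned char kstack[TOY_KSTACK_SIZE];
};

/* Stands in for the CPU's stack pointer register. */
struct toy_cpu {
    uintptr_t sp;
};

/* User Mode -> Kernel Mode: remember the user stack pointer and move to
 * the top of this thread's kernel stack (stacks grow downward here, so
 * an empty stack starts at the end of the array). */
void toy_enter_kernel(struct toy_cpu *cpu, struct toy_thread *t)
{
    t->user_sp = cpu->sp;
    cpu->sp = (uintptr_t)(t->kstack + TOY_KSTACK_SIZE);
}

/* Kernel Mode -> User Mode: go back to where the user stack was. */
void toy_exit_kernel(struct toy_cpu *cpu, const struct toy_thread *t)
{
    cpu->sp = t->user_sp;
}

/* Returns 1 if the stack pointer lies within this thread's kernel stack. */
int toy_on_kernel_stack(const struct toy_cpu *cpu, const struct toy_thread *t)
{
    uintptr_t lo = (uintptr_t)t->kstack;
    return cpu->sp >= lo && cpu->sp <= lo + TOY_KSTACK_SIZE;
}
```

Because every thread carries its own kstack array in this model, two threads that are both inside the kernel never share stack space, which is the point of giving each thread a kernel stack of its own.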

During system startup, the start_kernel function initializes the kernel init thread, which will then create other kernel threads and begin initializing user programs. So after system startup the CPU's stack pointer will be initialized to point to init's kernel stack.

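As far as the heap goes, you can dynamically allocate memory in the kernel using kmalloc. It hands out physically contiguous chunks from the slab allocator, which in turn gets whole pages from the page allocator (the layer behind functions such as get_zeroed_page). Note that kmalloc does not zero the memory it returns; kzalloc does.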
