maplaz maplaz - 5 months ago 21
Linux Question

How does the dynamic linker change the text segment of a process?

If I understand correctly, when a user tries to execute a dynamically linked
executable (with execve("foo", "", "")), the dynamic linker (ld-linux.so.2) is
loaded and executed instead of the text segment of "foo". It has to load the
libraries required for the program ("foo") to run, change some addresses
in "foo", and pass control to "foo", but how is this accomplished?

How (with what system call) and where does the dynamic loader load the
libraries and "foo"'s code and data in memory? I am guessing it can't simply use
malloc or mmap and then jump to the code, since that should be impossible,
right? It also seems unlikely that it creates a temporary file with a complete
executable (like a statically linked one) and calls execve again.

Answer

The actual implementation is quite complex because it builds on top of ELF, which itself tries to accommodate many scenarios, but conceptually it's quite simple.

Basically, once the library dependencies are located and opened, it's a couple of mmaps and mprotects, some modifications to implement the linking by binding symbols (which can be deferred), and then a jump to the code.
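
The question guessed that mapping memory and jumping into it should be impossible; that is in fact the essence of the mechanism. As a minimal sketch (not the dynamic linker's actual code), the C function below maps a page with mmap, copies a few bytes of machine code into it, switches the page to read+execute with mprotect, and calls it. The name run_mapped_code and the hand-written instruction bytes are assumptions made for illustration, and only x86-64 and AArch64 are covered. A real dynamic linker maps the library's segments straight from the file rather than copying bytes.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: maps a page, fills it with code that returns 42,
 * makes it executable and jumps to it.
 * Returns -1 on failure or on an unsupported architecture. */
int run_mapped_code(void)
{
#if defined(__x86_64__) || defined(__aarch64__)
#if defined(__x86_64__)
    /* mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
#else
    /* mov w0, #42 ; ret  (little-endian instruction words) */
    static const unsigned char code[] = { 0x40, 0x05, 0x80, 0x52,
                                          0xc0, 0x03, 0x5f, 0xd6 };
#endif
    size_t len = 4096;

    /* Step 1: get writable memory directly from the kernel (no malloc). */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* Step 2: place the code, then make sure the instruction cache sees it. */
    memcpy(mem, code, sizeof code);
    __builtin___clear_cache((char *)mem, (char *)mem + sizeof code);

    /* Step 3: drop write permission and add execute permission. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }

    /* Step 4: "jump to code" by calling through a function pointer. */
    int (*fn)(void) = (int (*)(void))mem;
    int result = fn();

    munmap(mem, len);
    return result;
#else
    return -1;
#endif
}
```

On a supported CPU, calling run_mapped_code() should return 42, the value the mapped instructions place in the return register. It returns -1 on other architectures or if the kernel refuses the mapping.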

Ideally, the linked shared libraries will have been compiled with -fpic/-fPIC, which allows the dynamic linker to place them anywhere in the process's address space without having to write to the text section of the library. Such a library calls functions from other libraries via a modifiable table (on ELF systems, the global offset table, reached through procedure linkage table stubs), which the dynamic linker fixes up (probably lazily) to point to the actual locations where it has loaded the dependent library. Access to variables in another shared library is similarly indirected.
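
The lazy fix-up of such a table can be imitated in plain C. The sketch below is a simulation, not how the real tables are laid out: the slot add_slot plays the role of one table entry, add_resolver stands in for the dynamic linker's resolver, and symtab is a made-up one-entry symbol table. All of these names are hypothetical. The point is only that callers always go through the writable slot, so the code itself never needs to be patched.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef int (*binop_fn)(int, int);

/* Stands in for a function that lives in another shared library. */
static int lib_add(int a, int b) { return a + b; }

/* Made-up symbol table that the resolver searches by name. */
struct symbol { const char *name; binop_fn addr; };
static const struct symbol symtab[] = { { "add", lib_add } };

static int lookups; /* how many times a symbol lookup was performed */

static binop_fn lookup(const char *name)
{
    lookups++;
    for (size_t i = 0; i < sizeof symtab / sizeof symtab[0]; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].addr;
    return NULL;
}

static int add_resolver(int a, int b);

/* The writable table slot. It initially points at the resolver. */
static binop_fn add_slot = add_resolver;

/* First call lands here: find the real target, patch the slot, forward. */
static int add_resolver(int a, int b)
{
    binop_fn target = lookup("add");
    if (target == NULL)
        abort(); /* unresolved symbol */
    add_slot = target;
    return target(a, b);
}

/* What calling code does: always call indirectly through the slot. */
int call_add(int a, int b) { return add_slot(a, b); }

int lookup_count(void) { return lookups; }
```

The first call_add(2, 3) goes through the resolver and performs one lookup; later calls reach lib_add directly, so lookup_count() stays at 1.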

Keeping modifications to library code and data to a minimum allows the code sections to be marked read-only (via the MMU, using the mprotect system call) and mapped into memory that's shared among all processes that use that particular library.
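
The resulting layout can be observed from inside a running process. As a rough sketch, assuming a Linux system with /proc mounted, the function below (its name is made up for this example) counts the mappings in /proc/self/maps that are file-backed, executable and not writable. These are the code segments of the executable and its libraries. They show up as private (p) mappings, but since nothing writes to them, their pages can come from the same page cache for every process that maps the file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts file-backed mappings with permissions
 * "r-x" (readable, not writable, executable) in this process.
 * Returns -1 if /proc/self/maps cannot be opened. */
int count_readonly_exec_file_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[4096];
    int count = 0;

    while (fgets(line, sizeof line, f) != NULL) {
        char perms[5] = "";
        char path[4096] = "";

        /* Line format: start-end perms offset dev inode [path] */
        if (sscanf(line, "%*s %4s %*s %*s %*s %4095s", perms, path) < 1)
            continue;

        /* File-backed mappings have a path starting with '/';
         * entries such as [vdso] or [heap] are skipped. */
        if (perms[0] == 'r' && perms[1] == '-' && perms[2] == 'x' &&
            path[0] == '/')
            count++;
    }

    fclose(f);
    return count;
}
```

Printing the matching lines instead of counting them is an easy way to see which libraries the dynamic linker mapped and at which addresses.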


To get an idea of what happens at the syscall level, you can try e.g.:

strace /bin/echo hello world

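and all the syscalls up to and including roughly the brk calls (which set up the heap at the end of the data segment) should be the doings of the dynamic linker. Note that strace reports the brk system call; sbrk is the C library wrapper around it.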


(malloc is indeed unavailable to the dynamic linker, as malloc is a feature of the C library, not of the system. malloc is about growing and managing the heap, and potentially mmapping other separate blocks and managing those alongside it. The dynamic linker isn't concerned with those parts of the process image, mainly just with its writable indirection tables and with where it maps the libraries.)
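
To illustrate how code can obtain memory without the C library's malloc, here is a minimal sketch of a bump allocator that gets its arena directly from mmap. It is not taken from any real dynamic linker; bump_alloc, ARENA_SIZE and the 16-byte alignment are assumptions chosen for the example, and there is deliberately no way to free.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#define ARENA_SIZE ((size_t)64 * 1024) /* hypothetical fixed arena size */

static unsigned char *arena_base; /* start of the mmap'ed arena */
static size_t arena_used;         /* bytes handed out so far */

/* Hypothetical allocator: hands out 16-byte-aligned chunks from a single
 * anonymous mapping. Returns NULL when the arena is exhausted or mmap fails. */
void *bump_alloc(size_t size)
{
    if (arena_base == NULL) {
        void *mem = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED)
            return NULL;
        arena_base = mem;
    }

    /* Round the request up to a multiple of 16, guarding against overflow. */
    size_t aligned = (size + 15) & ~(size_t)15;
    if (aligned < size || aligned > ARENA_SIZE - arena_used)
        return NULL;

    void *p = arena_base + arena_used;
    arena_used += aligned;
    return p;
}
```

Two consecutive small requests return addresses 16 bytes apart, and a request that no longer fits in the arena returns NULL.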