Is the dynamic linker (aka program interpreter, link loader) part of the kernel or the GCC library?
In an ELF executable, this is referred to as the "ELF interpreter". On linux (e.g.) this is
This is not part of the kernel; it is [generally] supplied with
glibc et al.
When the kernel executes an ELF executable, it must map the executable into userspace memory. It then looks inside for a special program header entry, shown by readelf as
INTERP [it points to a string that is the full path of the ELF interpreter].
The kernel then maps the interpreter into userspace memory and transfers control to it. Then, the interpreter does the necessary linking/loading and starts the program.
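As a rough illustration of that lookup, here is a minimal Python sketch that pulls the interpreter path out of an ELF image. It assumes a 64-bit little-endian file, and the function name `read_interp` is made up for this example. The kernel does far more validation than this.

```python
import struct

PT_INTERP = 3  # program header type whose contents are the interpreter path


def read_interp(data: bytes):
    """Return the interpreter path from an ELF64 little-endian image, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian files")

    # ELF64 header: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type at +0, p_offset at +8, p_filesz at +32.
        (p_type,) = struct.unpack_from("<I", data, off)
        (p_offset,) = struct.unpack_from("<Q", data, off + 8)
        (p_filesz,) = struct.unpack_from("<Q", data, off + 32)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # statically linked files have no PT_INTERP entry
```

Calling `read_interp(open("/usr/bin/vlc", "rb").read())` should give the same path that readelf reports as the requested program interpreter.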
ELF stands for "Executable and Linkable Format". The format is extensible: it allows many different sub-sections within the ELF file.
Rather than burdening the kernel with having to know about the myriad extensions, the ELF interpreter that is paired with the file knows about them.
Although usually only one format is used on a given system, there can be several different variants of ELF files on a system, each with its own ELF interpreter.
This would allow [say] a BSD ELF file to be run on a linux system [with other adjustments/support] because the ELF file would point to the BSD ELF interpreter rather than the linux one.
Every process (vlc player, chrome) had the shared library ld.so as part of its address space.
Yes. I assume you're looking at
/proc/<pid>/maps. These are mappings (e.g. like using
mmap) to the files. That is somewhat different than "loading", which can imply [symbol] linking.
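If you want to pick the file-backed mappings out of that listing programmatically, a small Python sketch follows. The helper name `mapped_files` is hypothetical, and it assumes the usual six-column layout (address range, permissions, offset, device, inode, pathname).

```python
def mapped_files(maps_text: str):
    """Return the distinct file paths found in /proc/<pid>/maps text, in order."""
    paths = []
    for line in maps_text.splitlines():
        # The pathname is the sixth field and may itself contain spaces.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping: no pathname column
        path = fields[5].strip()
        # Skip pseudo-entries such as [stack], [heap] and [vdso].
        if path.startswith("/") and path not in paths:
            paths.append(path)
    return paths
```

On Linux, `mapped_files(open("/proc/self/maps").read())` lists what is mapped into the Python process itself. The executable, the ELF interpreter and the shared libraries should all show up there.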
So primarily loader, after loading the executable (code & data) onto memory, it loads & maps dynamic linker (.so) to its address space
The best way to understand this is to rephrase what you just said:
So primarily the kernel, after mapping the executable (code & data) into memory, maps the dynamic linker (.so) into the program's address space
That is essentially correct. The kernel also maps other things, such as the
bss segment and the stack. It then "pushes"
envp [the space for environment variables] onto the stack.
Then, having determined the start address of
ld.so [by reading the entry-point field of its ELF header], it sets that as the resume address and starts the thread.
Up until now, it has been the kernel doing things. The kernel does little to no symbol linking.
ld.so takes over ...
which further loads shared libraries, maps & resolves references to libraries. It then calls entry function (_start)
Because the original executable (e.g.
vlc) has been mapped into memory,
ld.so can examine it for the list of shared libraries that it needs. It maps these into memory, but does not necessarily link the symbols right away.
Mapping is easy and quick--just an
The start address of the executable [not to be confused with the start address of
ld.so] is taken from the entry-point field (e_entry) of the executable's ELF header. Although the symbol associated with this start address has traditionally been called
_start, it could actually be named anything (e.g.
__my_start): it is the address stored in the header that determines where execution starts, not the name of the symbol.
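To see that the start address is just a number stored in the file, here is a minimal Python sketch that reads it directly. It assumes an ELF64 little-endian file, and `read_entry` is a name invented for this example.

```python
import struct


def read_entry(data: bytes) -> int:
    """Return the entry-point address (e_entry) of an ELF64 little-endian image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian files")
    # In the ELF64 header, e_entry is the 8-byte field at offset 24.
    (e_entry,) = struct.unpack_from("<Q", data, 24)
    return e_entry
```

No symbol name is consulted anywhere: `hex(read_entry(open("/usr/bin/vlc", "rb").read()))` should match the entry point address that readelf prints in the ELF header.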
Linking symbol references to symbol definitions is a time-consuming process. So, for function calls, this is deferred until the symbol is actually used. That is, if a program has references to
printf, the linker doesn't actually try to link in
printf until the first time the program actually calls it.
This is sometimes called "link-on-demand" or "on-demand-linking" [the ld.so documentation calls it "lazy binding"]. See my answer here: Which segments are affected by a copy-on-write? for a more detailed explanation of that and what actually happens when an executable is mapped into userspace.
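The idea of resolving on first use can be shown with a toy model in Python. This is only an analogy, not how ld.so is implemented: the class `LazyTable`, its fields and the fake library are all invented for illustration.

```python
class LazyTable:
    """Toy model of on-demand linking: resolve a symbol on its first call only."""

    def __init__(self, libraries):
        self.libraries = libraries  # list of dicts: symbol name -> callable
        self.slots = {}             # symbols that have already been resolved
        self.lookups = 0            # how many (expensive) searches were done

    def _resolve(self, name):
        # Stand-in for the costly search through every mapped library.
        self.lookups += 1
        for lib in self.libraries:
            if name in lib:
                return lib[name]
        raise LookupError(f"undefined symbol: {name}")

    def call(self, name, *args):
        target = self.slots.get(name)
        if target is None:
            # First use: search once, then remember the result.
            target = self._resolve(name)
            self.slots[name] = target
        return target(*args)


# A fake "library" that defines one symbol.
fake_libc = {"printf": lambda fmt, *args: fmt % args}
table = LazyTable([fake_libc])
table.call("printf", "%d", 1)    # triggers the one and only lookup
table.call("printf", "%s", "x")  # reuses the cached slot
```

After both calls, `table.lookups` is 1. A symbol that is never called is never searched for at all.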
If you're interested, you could do
ldd /usr/bin/vlc to get a list of the shared libraries it uses. If you look at the output of
readelf -a /usr/bin/vlc, you'll see the shared libraries it needs directly [as NEEDED entries]. Also, you'll get the full path of the ELF interpreter and could do
readelf -a <full_path_to_interpreter> and note some of the differences. You could repeat the process for any
.so files that
Combining all that with
/proc/<pid>/maps et al. might help with your understanding.