Dean Leitersdorf - 2 months ago
C Question

Time waste of execv() and fork()

I am currently learning about fork() and execv() and I had a question regarding the efficiency of the combination.

I was shown the following standard code:

pid = fork();
if (pid < 0) {
    // handle fork error
}
else if (pid == 0) {
    execv("son_prog", argv_son);
    // reached only if execv() fails
    _exit(127);
}
else {
    // do father code
}


I know that fork() clones the entire process (copying the entire heap, etc.) and that execv() replaces the current address space with that of the new program. With this in mind, isn't this combination very inefficient? We copy the entire address space of a process and then immediately overwrite it.

So my question: What is the advantage that is achieved by using this combo (instead of some other solution) that makes people still use this even though we have waste?

Answer

What is the advantage that is achieved by using this combo (instead of some other solution) that makes people still use this even though we have waste?

You have to create a new process somehow. There are very few ways for a userspace program to accomplish that. POSIX used to have vfork() alongside fork(), and some systems may have their own mechanisms, such as the Linux-specific clone(), but since 2008, POSIX specifies only fork() and the posix_spawn() family. The fork + exec route is more traditional, is well understood, and has few drawbacks. The posix_spawn family is designed as a special-purpose substitute for use in contexts that present difficulties for fork(); you can find details in the "Rationale" section of its specification.
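To make the posix_spawn() alternative concrete, here is a minimal sketch of starting a program and waiting for it without calling fork() directly. The helper name spawn_and_wait is made up for illustration, and the sketch uses posix_spawnp(), which searches PATH (unlike execv() in the question, which takes a path).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;  /* the caller's environment, passed on to the child */

/* Hypothetical helper: start `file` (looked up via PATH) with `argv`,
 * wait for it, and return its exit status, or -1 on failure. */
int spawn_and_wait(const char *file, char *const argv[])
{
    pid_t pid;
    int status;

    assert(file != NULL && argv != NULL);

    /* posix_spawnp() reports failure through its return value, not errno.
     * The two NULL arguments mean "no file actions, default attributes". */
    int err = posix_spawnp(&pid, file, NULL, NULL, argv, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawnp: %s\n", strerror(err));
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this helper, something like spawn_and_wait("son_prog", argv_son) would stand in for the fork() and execv() pair from the question, provided son_prog can be found via PATH.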

This excerpt from the Linux man page for vfork() may be illuminating:

Under Linux, fork(2) is implemented using copy-on-write pages, so the only penalty incurred by fork(2) is the time and memory required to duplicate the parent’s page tables, and to create a unique task structure for the child. However, in the bad old days a fork(2) would require making a complete copy of the caller’s data space, often needlessly, since usually immediately afterwards an exec(3) is done. Thus, for greater efficiency, BSD introduced the vfork() system call, which did not fully copy the address space of the parent process, but borrowed the parent’s memory and thread of control until a call to execve(2) or an exit occurred. The parent process was suspended while the child was using its resources. The use of vfork() was tricky: for example, not modifying data in the parent process depended on knowing which variables are held in a register.
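If you want to see this for yourself, a rough way is to time fork() while the parent holds heaps of different sizes. The sketch below is only an illustration: the helper name time_fork_seconds is made up, and the numbers you get depend on the system, so no particular result is claimed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: allocate and touch `bytes` of heap, then return how
 * many seconds a single fork() call took in the parent, or -1.0 on error. */
double time_fork_seconds(size_t bytes)
{
    struct timespec t0, t1;
    long page = sysconf(_SC_PAGESIZE);

    assert(bytes > 0);
    if (page <= 0)
        page = 4096;  /* assumed fallback page size */

    /* volatile keeps the compiler from dropping the writes below */
    volatile char *buf = malloc(bytes);
    if (buf == NULL)
        return -1.0;
    for (size_t i = 0; i < bytes; i += (size_t)page)
        buf[i] = 1;   /* touch one byte per page so the page is really mapped */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);     /* child: leave at once without writing to buf */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    free((void *)buf);
    if (pid < 0)
        return -1.0;
    waitpid(pid, NULL, 0);  /* reap the child */

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling it with, say, 1 MiB and then a few hundred MiB and comparing the results gives a feel for how much of the cost is page-table duplication rather than copying data.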

Thus, your concern about waste is not well-founded for modern systems, but it was indeed an issue historically, and there were indeed mechanisms designed to avoid it. These days, most of those mechanisms are obsolete.
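For reference, here is a sketch of the fork + exec pattern from the question written out in full, with the exec-failure path and a waitpid() in the parent. The helper name fork_exec_wait is hypothetical, and the exit code 127 for a failed exec is a shell convention, not something execv() requires.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with `argv` in a child
 * process and return its exit status, or -1 on failure. */
int fork_exec_wait(const char *path, char *const argv[])
{
    int status;

    assert(path != NULL && argv != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* child: replace this process image with the new program */
        execv(path, argv);
        /* execv() returns only on failure */
        perror("execv");
        _exit(127);
    }

    /* parent: wait for the child and report how it ended */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the child uses _exit() rather than exit() after a failed execv(), so that it does not flush stdio buffers or run atexit handlers inherited from the parent.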