Moving beyond fork() + exec()

Posted by jwilk 4 hours ago

148 points | 114 comments | page 2

mike_hock 2 hours ago|

The most astonishing part is that this is dated June 5th, 2026.

I.e. a year that starts with 20, not 19.

JdeBP 1 hour ago|

These discussions were definitely had back in the 20th century too. The spawn model versus the fork+execve model has been an on-going debate since the time of MS/PC/DR-DOS.

debatem1 3 hours ago||

There are a lot of slightly different fork-exec-like things in the concept space and it's hard to imagine one approach satisfying them all. IMO it would be interesting to take an approach analogous-ish to sched_ext_ops where you build the rough flow chart of a combined fork-exec, but with hooks that let eBPF change behavior or skip the bits these sophisticated users don't want/need.

MBCook 2 hours ago|

Fork/exec is great if you actually want the traditional copy of your process for some reason.

For launching something totally new, like the example in the article of some tool calling git, I think it does make a ton of sense to make something new.

Especially since I suspect that is by far the more common case. I suspect “I want a clone of me” is relatively rarely used at this point.
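For reference, the closest existing POSIX interface to "launch something totally new" is posix_spawn(). Below is a minimal sketch using Python's os.posix_spawnp wrapper. The helper name spawn_and_wait is made up for illustration, and sys.executable only stands in for an external tool such as git. It assumes Python 3.9+ on a POSIX system.

```python
import os
import sys


def spawn_and_wait(argv, env=None):
    """Launch argv as a new process via posix_spawnp and return its exit code."""
    if env is None:
        env = dict(os.environ)
    # posix_spawnp searches PATH for argv[0], much like execvp would.
    pid = os.posix_spawnp(argv[0], argv, env)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    # sys.executable is a stand-in for an external tool such as git.
    code = spawn_and_wait([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exited with", code)
```

The caller never runs any of its own code in a half-initialized copy of itself; it only describes the process it wants and waits for it.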

Sophira 3 hours ago||

I'm guessing that a big part of the problem with moving away from fork() in general is that each new process needs a copy of the parent process' environment anyway, right?

zerobees 3 hours ago||

The LWN article is incorrect in saying that it "must copy the entire process state (including memory) for the child process". There are some kernel structures and page tables that need to be initialized, plus you need a new stack, but it's not nearly as dramatic as implied. Most of the parent's memory is "incorporated by reference", so to speak.

In fact, if you profile it, in the fork() + execve() model, execve() is far more expensive, because not only does it replace the old process with a new one, but it also involves running the dynamic linker, which opens, parses, and mmaps library files.

It still makes sense to get rid of the fork() overhead if you're going to throw away the cloned process state soon thereafter, but if you wanted to make process execution radically faster, rethinking the exec architecture would probably offer more significant gains.

corbet 2 hours ago|||

The kernel does not copy every page, but it does have to copy all of the VMAs. Setting memory to COW (which can involve changing a lot of page-table-entries) is not free either. I guess I could have mentioned copy-on-write explicitly, but I do not believe that what I wrote was incorrect.

nasretdinov 2 hours ago|||

Fork becomes more and more expensive the higher the RSS of the process, roughly 1 ms per 1 GB of process size with 4 KB pages. Given that modern servers can easily support 1-2 TB of RAM, the fork() part can easily take several hundred milliseconds, blocking everything in the meantime. So for larger programs you kinda have to have a "fork helper" process if you need to execute external programs for some reason.
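Taking the figure above at face value, a process with 500 GB resident would spend roughly 500 ms in fork(). A rough way to check the trend on your own machine is sketched below. The helper name time_fork and the sizes are arbitrary choices for illustration, the 4096-byte page size is an assumption, and the absolute numbers will vary with kernel, hardware, and huge-page settings.

```python
import os
import time


def time_fork(resident_bytes, page_size=4096):
    """Return seconds spent in fork() while holding about resident_bytes of touched memory."""
    ballast = bytearray(resident_bytes)
    # Touch one byte per page so the pages are actually resident, not just reserved.
    for offset in range(0, resident_bytes, page_size):
        ballast[offset] = 1
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits immediately; only the fork cost is of interest
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed


if __name__ == "__main__":
    for mib in (1, 64, 256):
        print(f"{mib:4d} MiB resident: fork took {time_fork(mib << 20) * 1e3:.3f} ms")
```

This measures only the parent's time inside fork(), which is the part that blocks the caller.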

dijit 3 hours ago|||

I'm a bit naive, but I don't think that's necessarily a requirement.

It might be commonly held convention, and thus, an assumption, in Linux (and, broadly, UNIX) but I don't think it's true inside VAX or even Windows, so I don't think it's a requirement.

Unless I've missed something (which is totally possible, this is not an area of OS design I've spent much time on).

lanstin 3 hours ago|||

But also UID, groups, controlling TTY, process group, capabilities, pipes, shared memory, etc. and the file descriptors while maybe not inherently needed are how a lot of Unix plumbing works.

sjmulder 3 hours ago|||

Even DOS has environment inheritance!

sanderjd 3 hours ago|||

A lot of times you actively don't want the parent environment or any of the memory or file descriptors. And then you have to actively do work to fix all that stuff up after the fork.
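To make the "fix all that stuff up" part concrete, here is a small sketch of the manual cleanup between fork() and execve(). The helper name fork_exec_clean, the keep_env whitelist, and the descriptor upper bound of 4096 are all assumptions for illustration; real code would also deal with signals, process groups, and so on.

```python
import os
import sys


def fork_exec_clean(argv, keep_env=("PATH",)):
    """fork(), scrub inherited state in the child, execve(); return the child's stdout."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: everything here is cleanup of state inherited from the parent.
        try:
            os.dup2(write_fd, 1)    # route stdout into the pipe
            os.closerange(3, 4096)  # drop other inherited descriptors (4096 is an assumed bound)
            env = {k: os.environ[k] for k in keep_env if k in os.environ}
            os.execve(argv[0], argv, env)
        finally:
            os._exit(127)  # only reached if something above failed
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        output = pipe.read()
    os.waitpid(pid, 0)
    return output


if __name__ == "__main__":
    probe = "import os; print(sorted(os.environ))"
    print(fork_exec_clean([sys.executable, "-c", probe]).decode())
```

Every line in the child branch before execve() exists only to undo what fork() copied.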

lokar 3 hours ago||

the environment is not that big

lokar 3 hours ago||

This seems unnecessary to me. In the example, the core of git should be a library you can link so you don't need to run the binary. That would be better in every way.

1718627440 2 hours ago||

But when you use a process, you get tons of things for free: the subtask is invoked in parallel, you get isolation, and you can control execution. Unless you are already writing a multithreaded program or already accept passing objects in memory, using a process is actually easier to write than using a library.

If I use a library, I also need to start using threads and need to invent some core synchronization mechanism. I am essentially reinventing a small scheduler, when I already get this from the OS for free. Also, now any crash in the third-party code will crash the whole program, and the third-party code has access to the whole address space. With invoking a process you also have a standardized API implemented by the OS.

omoikane 2 hours ago|||

Launching git repeatedly was probably not the best example. But it's hard to think of good examples where launching processes repeatedly is the most performant thing to do, probably because launching processes had been expensive and everyone has learned to do something else (libraries, zygotes, etc). Maybe a different question is: if launching processes were cheap, is there something we would implement as processes instead of libraries?

I can recall just one program that's intentionally not implemented as a library, but I think people have since built a library on top of it:

https://dechifro.org/dcraw/#:~:text=Why%20don%27t%20you%20im...

sanderjd 3 hours ago||

There are lots of reasons to want to spawn fresh processes, which aren't solved by linking a library.

lokar 3 hours ago|||

Sure, but not many times a second

kllrnohj 2 hours ago||

Every build system ever says hello.

aerzen 3 hours ago|||

Spawning processes should not be on the hot path of any program.

1718627440 3 hours ago|||

Why? That's a very useful processing primitive.

lokar 2 hours ago||

It’s a hack with many disadvantages. Sometimes a hack is the right answer, but the kernel shouldn't add a primitive for it.

MBCook 2 hours ago||

Should bash link in every program the user might want? Load them up as dynamic libraries?

pizlonator 2 hours ago|||

It ends up on the hot path of programs that use process isolation aggressively

hparadiz 3 hours ago||

Maybe tangentially related but I always think it's silly that every Linux process has the same libgcc_s.so.1 loaded into memory for each process even though the raw binary for the library is exactly the same so you end up with like 800 copies of libgcc_s.so.1 in memory.

I mean maybe this has been optimized for already and I don't know what I'm talking about but maybe someone with more knowledge about the kernel knows? Is this something we simply can't optimize for because of security implications?

201984 3 hours ago||

Shared libraries (and mmapped files in general) are deduplicated; it's nowhere near as bad as you think. The kernel loads a .so into memory once and then maps that memory into every process that mmaps it.

Editing to add: this deduplication is one of the greatest upsides to dynamic linking. Common libs like libgcc and libc only have to exist in memory once and can stay in CPU caches, whereas if they were statically linked into every binary, each binary would have a copy of that library that wouldn't be shared with anything else and you'd waste a lot of memory.
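If you want to see this on a Linux box, /proc/<pid>/smaps reports per-mapping counters such as Shared_Clean (unmodified pages also mapped by other processes) and Private_Dirty (pages this process has written to). A small parsing sketch follows; the function name sharing_by_file is made up for illustration, and it assumes the usual smaps layout of a mapping header line followed by "Name: N kB" lines.

```python
import re

# Mapping header: address range, perms, offset, device, inode, optional path.
_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\d+\s*(.*)$")
# Counter line such as "Shared_Clean:         96 kB".
_FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")


def sharing_by_file(smaps_text, needle):
    """Sum the kB counters of every mapping whose path contains needle."""
    totals = {}
    current_matches = False
    for line in smaps_text.splitlines():
        header = _HEADER.match(line)
        if header:
            current_matches = needle in header.group(1)
            continue
        field = _FIELD.match(line)
        if field and current_matches:
            name, value = field.group(1), int(field.group(2))
            totals[name] = totals.get(name, 0) + value
    return totals


if __name__ == "__main__":
    # Linux-only: inspect the current process's own mappings of libc.
    with open("/proc/self/smaps") as f:
        print(sharing_by_file(f.read(), "libc"))
```

On a typical system you would expect most of a common library's resident pages to show up as shared, with only a small private portion.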

sjmulder 3 hours ago||

Doesn't the loaded code have to be patched for relocations?

ptspts 3 hours ago|||

It does, so not 100% is reused. The patched parts are in different sections though, so the entire .text (code) section ends up being reused.

monocasa 3 hours ago||||

Not on modern archs that provide decent support for PIE (position independent executables).

201984 2 hours ago||

How do you think position independent code can call functions from other .so's without being patched with their addresses?

They can't, so even PIC code still has to have a relocation table that gets patched. It's in a different page than the code though, so code does still get reused.

monocasa 1 hour ago||

That's not really patching though, any more than any use of function pointers is patching.

201984 22 minutes ago||

There's a part of the .so ELF file (the Global Offset Table aka GOT) that has to be modified with all the addresses of the functions being imported, which of course vary from process to process.

If not patching, what exactly would you call modifying part of the file?

t-3 3 hours ago|||

Not if it's position-independent.

saidinesh5 3 hours ago|||

Typically libgcc_s.so is loaded by the linker, which uses an mmap call to map the binary into the address space.

> The kernel keeps track of which file is mapped where, and can detect when a request is made to map an already mapped file again, avoiding physical memory allocation if possible.

Relevant stack overflow answer: https://stackoverflow.com/questions/61950951/linux-shared-li...

mlaretallack 3 hours ago|||

In Linux, when a shared lib is loaded by multiple processes, it's loaded once and not duplicated in RAM. Only if a memory page is modified by the process will the memory be duplicated. (Hope I have explained that correctly)

monocasa 3 hours ago|||

Those mappings by default all go to the same shared memory.

Unices have been sharing executable memory between processes longer than there's been mmap for user space to do the same thing themselves. I remember seeing it in the 2BSD kernel for instance.

BoingBoomTschak 3 hours ago|||

Eh? Aren't shared libraries actually shared in memory?

1718627440 2 hours ago||

Yeah, that's kind of the point.

sirsinsalot 3 hours ago||

I have a rule for myself. If I think something is silly or stupid, I assume I don't understand it. I usually find I do not understand it, and it no longer seems silly when I do understand it.

In this case too, you think it is silly because you don't understand it. Your assumptions are wrong, making it seem silly.

burnt-resistor 3 hours ago|

> "If you are repeatedly creating large processes, you are already doing it wrong. The fix is in user space, not the kernel."

Every couple of years, someone claims they have "the solution" implying everyone else who came before them didn't know what they were doing.

yxhuvud 1 hour ago|

It can also mean that neither the hardware side nor the software side is static; both change over time. That means that their demands and what they allow also change over time. This leads to the insight that what was perhaps a good idea on 70s hardware/software is not necessarily a good, or even ok, idea 50 years later on modern hardware executing OSes and programs that have been kept up to date.