Comment by cryptonector

4 years ago

If the parent is threaded and the host has more than one CPU, then fork() == TLB shootdowns, which are slow.

There's also the cost of all the page faults that the two processes are likely to take to do the copying.

And lastly there's all sorts of complexity when multiple parent threads call fork(), or when the child calls fork() (or vfork()) again before calling exec().

It's just much easier to copy the resident set and mark the address space as CoW. Then you only have to worry about page faults for pages that are not in core, which were going to fault anyway, and you don't have to worry about TLB shootdowns either (if a page is not in core, no TLB references it). You still have the multi-fork issues, but now you can use an atomic reference count on the address space.
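
To make the page-fault cost above concrete, here is a rough userland sketch, assuming a Unix-like system whose fork() is copy-on-write (as on mainstream kernels) and Python 3. The names `cow_demo` and `n_pages` are made up for illustration. The child dirties one byte per page of a buffer that was resident in the parent before fork(), reports how many minor faults it took while doing so, and the parent checks that its own copy is untouched. The fault count depends on the kernel and allocator, so treat it as indicative only.

```python
import os
import resource
import struct

PAGE = resource.getpagesize()


def cow_demo(n_pages=256):
    """Fork, dirty n_pages in the child, and return a tuple:
    (minor faults the child took while writing, parent's buffer still intact)."""
    # Filling the buffer touches every page, so it is resident before fork().
    buf = bytearray(b"p" * (n_pages * PAGE))
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: write one byte at each page-sized stride of the buffer.
        os.close(r)
        before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        for off in range(0, len(buf), PAGE):
            buf[off] = ord("c")
        after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        os.write(w, struct.pack("q", after - before))
        os._exit(0)
    # Parent: collect the child's fault count, then reap the child.
    os.close(w)
    data = os.read(r, 8)
    os.close(r)
    os.waitpid(pid, 0)
    (faults,) = struct.unpack("q", data)
    # The child's writes must not be visible in the parent's copy.
    intact = buf.count(b"p") == len(buf)
    return faults, intact


if __name__ == "__main__":
    faults, intact = cow_demo()
    print(f"child minor faults while writing: {faults}")
    print(f"parent buffer unchanged: {intact}")
```

This only shows the child's side of the copying; it says nothing about TLB shootdowns, which are not observable this way from userland.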