Comment by evilotto

7 years ago

Not even just the semantics, the performance is awful. Even when the fork is virtual (as any modern fork is) and there's no memory copying because it's COW, all the kernel page tables still need to be copied and for a multi-GB process that's nontrivial. That's why any sane large service that needs to fork anything will early on start up a slave subprocess whose only job is to fork quickly when the master process needs it.
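
For anyone who hasn't seen the pattern, here is a rough Python sketch of such a helper, assuming a POSIX system and Python 3.9+. The `Spawner` class, `_serve`, and the one-JSON-line-per-request protocol are made up for illustration and not taken from any particular service. The helper is forked while the process is still small, then launches commands on request and reports their exit codes.

```python
import json
import os
import socket
import sys


def _serve(sock):
    """Helper loop: read one JSON argv per line, run it, reply with its exit code."""
    rfile, wfile = sock.makefile("r"), sock.makefile("w")
    for line in rfile:
        argv = json.loads(line)
        child = os.fork()  # cheap: the helper never grew large
        if child == 0:
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if exec failed
        _, status = os.waitpid(child, 0)
        wfile.write(json.dumps(os.waitstatus_to_exitcode(status)) + "\n")
        wfile.flush()


class Spawner:
    """Hypothetical fork helper; create it before the big allocations happen."""

    def __init__(self):
        parent_sock, child_sock = socket.socketpair()
        self.pid = os.fork()
        if self.pid == 0:
            try:
                parent_sock.close()
                _serve(child_sock)
            finally:
                os._exit(0)  # never fall back into the caller's code
        child_sock.close()
        self._sock = parent_sock
        self._r, self._w = parent_sock.makefile("r"), parent_sock.makefile("w")

    def run(self, argv):
        """Ask the helper to run argv and return its exit code."""
        self._w.write(json.dumps(argv) + "\n")
        self._w.flush()
        return json.loads(self._r.readline())

    def close(self):
        """Close the channel (the helper sees EOF and exits) and reap the helper."""
        self._w.close()
        self._r.close()
        self._sock.close()
        os.waitpid(self.pid, 0)


if __name__ == "__main__":
    spawner = Spawner()
    print(spawner.run([sys.executable, "-c", "raise SystemExit(3)"]))
    spawner.close()
```

This version handles one request at a time and waits for each command to finish; a real helper would probably want asynchronous children and some error reporting.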

>all the kernel page tables still need to be copied and for a multi-GB process that's nontrivial

Only in the pathological case where the large process is backed solely by 4 KB pages. The hardware has long supported large pages (on x86 since Pentium Pro, if memory serves) and huge pages. The popular OSes (Linux 2.6+ and Windows 2003+) support large and huge pages too. A 2 GB process can easily be three pages: r/x code, r/w stack, r/w data (2 GB). Granted, it gets a bit more complex if mmapped I/O or JIT are used, but since both are mature technologies now, it's fine to point fingers at any inefficiency and demand better. Another caveat would probably be shared libraries loading at separate address ranges, which, IMO, is another reason to ditch shared libraries for good.
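
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes x86-64 paging with 8-byte entries and counts only the leaf entries that map the data, ignoring the upper levels of the tree; the function names are just for illustration.

```python
KIB, MIB, GIB = 1 << 10, 1 << 20, 1 << 30
ENTRY_BYTES = 8  # assumption: x86-64 page-table entry size


def leaf_entries(mapped_bytes, page_bytes):
    """Number of leaf page-table entries needed to map mapped_bytes."""
    return -(-mapped_bytes // page_bytes)  # ceiling division


def leaf_table_bytes(mapped_bytes, page_bytes):
    """Bytes of leaf page-table entries, upper levels not counted."""
    return leaf_entries(mapped_bytes, page_bytes) * ENTRY_BYTES


for name, size in (("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB), ("1 GiB", GIB)):
    print(name, leaf_entries(2 * GIB, size), leaf_table_bytes(2 * GIB, size))
```

For a 2 GiB mapping that works out to 524288 entries (4 MiB of tables) with 4 KiB pages, 1024 entries (8 KiB) with 2 MiB pages, and 2 entries with 1 GiB pages.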

Contrary to popular wisdom, OS research is still relevant.

  • You want to ditch shared libraries and mmap to map your big processes using GB pages to make fork fast again (despite it not being the main and only drawback)???

    OS research might be relevant, and it's good that some people have wild ideas, but honestly I doubt this one will go anywhere :P

    • Ah sorry, I only want to ditch the shared libraries; I added the mmap remark in a later edit and didn't realize it was unclear. Of course mmap is necessary.

  • > Contrary to popular wisdom, OS research is still relevant.

    Is it really popular wisdom though, or is it the opinion of one person that got hyped up, much like the subpar programming language that same person worked on?

> That's why any sane large service that needs to fork anything will early on start up a slave subprocess whose only job is to fork quickly when the master process needs it.

I don't think that's (entirely) true. It's more that a large service with some potent master process will have said process Do Stuff(tm): opening files, starting threads, installing signal handlers, or whatever else then has to be taken care of one way or the other when forking a worker (or any other child) process. It's therefore much simpler to have the master fork off a child spawner early on, before it has done anything. You significantly reduce your chances of screwing up if you have nothing to clean up.

  • That's true, it's not the only reason. Dealing with threads and buffers and pthread_atfork and the associated heartbreak is a biggie also. But the performance is nothing to laugh at.

    I just did a quick test: a 100 MB process generally takes >2 ms to fork, while a process of 1 MB or less takes 70 µs. It seems like it's pretty much linear with process size.
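
A test along those lines can be sketched in Python as below. This is not the script used for the numbers above, just one way to reproduce the measurement on a POSIX system; `time_fork` and its parameters are made-up names, and absolute timings will differ by machine and kernel.

```python
import os
import time


def time_fork(ballast_bytes, repeats=20):
    """Median seconds for fork() to return in a parent holding ballast_bytes."""
    ballast = bytearray(ballast_bytes)
    # Touch every 4 KiB page so the memory is really mapped, not just reserved.
    for offset in range(0, ballast_bytes, 4096):
        ballast[offset] = 1

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: leave at once, skipping cleanup handlers
        samples.append(time.perf_counter() - start)
        os.waitpid(pid, 0)  # reap the child outside the timed region
    samples.sort()
    return samples[len(samples) // 2]


if __name__ == "__main__":
    for mib in (1, 10, 100):
        micros = time_fork(mib << 20) * 1e6
        print(f"{mib:4d} MiB ballast: fork took about {micros:.0f} us")
```

The interpreter itself adds a baseline of mapped memory, so the small case will not get as low as a minimal C program would.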

The performance is awful, but in return you get the COW memory you mentioned. That's a pretty huge benefit for a lot of programs with huge, seldom-changing memory state at startup. If those programs want to parallelize themselves without duplicating that memory or paying startup time/CPU overhead, fork() is a pretty handy way to achieve that.
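
A minimal sketch of that use, assuming a POSIX system: the parent builds its state once, then forks workers that read it without any copying or reloading up front. `parallel_sum` is a hypothetical helper, and a sum stands in for real work.

```python
import os


def parallel_sum(data, workers=4):
    """Sum data using forked workers that inherit it from the parent."""
    chunk = (len(data) + workers - 1) // workers
    children = []
    for w in range(workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: data is already here, shared copy-on-write with the parent.
            os.close(read_fd)
            try:
                part = sum(data[w * chunk:(w + 1) * chunk])
                os.write(write_fd, str(part).encode())
            finally:
                os._exit(0)
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as result:
            total += int(result.read())  # reads until the child exits
        os.waitpid(pid, 0)
    return total


if __name__ == "__main__":
    big_table = list(range(1_000_000))  # built once, before forking
    print(parallel_sum(big_table))
```

One caveat specific to CPython: reference-count updates write to object headers, so pages holding many small objects can still end up copied in the children even when the program only reads them.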

These days, you can usually start a process without forking, through posix_spawn/vfork. Although, I gather some servers still fork so they can set the child's current working directory more easily.
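
For reference, a small sketch of the posix_spawn route using Python's `os.posix_spawnp` (available since Python 3.8 on POSIX systems). `spawn_and_wait` is a made-up helper name, and `os.waitstatus_to_exitcode` needs Python 3.9+.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv via posix_spawnp (searching PATH) and return its exit code."""
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print(spawn_and_wait([sys.executable, "-c", "raise SystemExit(5)"]))
```

Note that `os.posix_spawnp` takes no working-directory argument, which matches the cwd point above: to start the child in another directory you either change directory in the parent first or fall back to a fork/exec style launcher.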