Comment by jph

4 years ago

Is it fair to implement first with fork() because of its memory protection, then benchmark and potentially switch to vfork() for speed? Benchmarks could look at locking, copy-on-write memory, stack sharing, etc.

What are good practices around the security tradeoffs of fork() vs. vfork(), especially in terms of ease of writing correct code? I'd thought that fork() + exec() tends to encourage clearer separation/isolation. For example, I've written small daemons using fork() + exec() because it seems safe and easy to do at the start.

In short, fork() mixes poorly with multi-threaded code and has some security footguns: you have to explicitly unshare any sensitive parts of the environment, such as file descriptors, which means you suddenly need to know every file descriptor used in the whole program from a single place in the code. Here is a well-written comment about fork() from David Chisnall: <https://lobste.rs/s/cowy6y/fork_road_2019#c_zec42d>
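To make the file descriptor point concrete, here is a minimal Python sketch (POSIX only) of an inheritable descriptor surviving into an exec'd child while a close-on-exec one does not. `PROBE` and `fd_survives_exec` are names made up for this illustration, and the child is simply another Python interpreter that checks whether the descriptor number it was given is open.

```python
import os
import subprocess
import sys

# Tiny child program: exit 0 if the fd number passed in argv is open, 1 if not.
PROBE = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)


def fd_survives_exec(inheritable: bool) -> bool:
    """Report whether a pipe fd is still open in a freshly exec'd child."""
    r, w = os.pipe()  # Python creates these non-inheritable (close-on-exec)
    try:
        os.set_inheritable(r, inheritable)
        # close_fds=False: do not sweep descriptors, rely on the per-fd flag.
        proc = subprocess.run(
            [sys.executable, "-c", PROBE, str(r)], close_fds=False
        )
        return proc.returncode == 0
    finally:
        os.close(r)
        os.close(w)
```

Calling `fd_survives_exec(True)` should report that the child can see the descriptor, and `fd_survives_exec(False)` that it cannot. The footgun is that in C the inheritable case is the default unless every `open()` in every library remembers `O_CLOEXEC`.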

Additionally, the fork()+exec() idiom practically forces OS designers into a corner where they simply have to implement Copy-on-Write for virtual memory pages, or otherwise the whole userspace using this idiom is going to be terribly slow. Without the fork()+exec() idiom you don't need CoW to be efficient.
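For comparison, a spawn-style call starts the new program directly, so the parent's address space never has to be duplicated from the caller's point of view. A rough Python sketch using `os.posix_spawn` follows; `spawn_and_wait` is a hypothetical helper name, and how the C library implements the call underneath (it may well use vfork or clone internally) is platform-specific.

```python
import os
import sys


def spawn_and_wait(argv: list[str]) -> int:
    """Start argv[0] with posix_spawn and return its exit code."""
    # No user-visible fork step: the child begins life running the new program.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

For example, `spawn_and_wait([sys.executable, "-c", "raise SystemExit(7)"])` runs a child interpreter and hands back its exit status.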

  • Fork mixes so poorly with multithreaded code that a lot of modern languages that are built from the beginning with threads of one sort or another in mind, like Go, simply won't let you do it. There is no binding to fork in the standard library.

    I think you could bash it together yourself with raw syscalls, because that can't really be stopped once you have a syscall interface, but basically the Go runtime is built around assuming it won't be forked. I have no idea what would happen to even a "single threaded" Go program if you forked it, and I have no intention of finding out. The lowest level option given in the syscall package is ForkExec: https://pkg.go.dev/syscall#ForkExec

    And this is a package that will, if you want, create new event loops outside of the Go runtime's control, set up network connections outside of the runtime's control, and go behind the runtime's back in a variety of other ways... but not this one. If you want this, you'll be looking up numbers yourself and using the raw Syscall or RawSyscall functions.

    • > I have no idea what would happen to even a "single threaded" Go program if you forked it, and I have no intention of finding out.

      I'm not an expert on Go internals, but the GC in Go is multithreaded, so I would assume forking will kill the GC. Better hope it's not holding any mutexes.

  • TL;DR: if another thread is holding a lock when you fork, that lock will be stuck locked in the child, but the thread that was holding it no longer exists there.

    So if your multi-threaded program uses malloc you may fork while a global allocation lock is being held and you won't be able to use malloc or free in the child (thread-local caches aside).

    There are other problems but this is the basic idea. To be fork-safe you need to allow any thread to just disappear (or halt forever) at any point in your program.
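The stuck-lock scenario described above can be reproduced in a few lines. This is a sketch in Python rather than C, assuming a POSIX system; `child_can_take_lock` is a made-up name, and an ordinary `threading.Lock` stands in for something like an allocator's global lock.

```python
import os
import threading
import warnings


def child_can_take_lock(timeout: float = 0.5) -> bool:
    """Fork while another thread holds a lock; can the child acquire it?"""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder() -> None:
        with lock:
            held.set()      # tell the main thread the lock is now taken
            release.wait()  # keep holding it until told to stop

    worker = threading.Thread(target=holder)
    worker.start()
    held.wait()

    r, w = os.pipe()
    with warnings.catch_warnings():
        # Newer Pythons warn about fork() in a multi-threaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: the holder thread does not exist here, but the lock's
        # "locked" state was copied, so nobody is left to release it.
        try:
            got = lock.acquire(timeout=timeout)
            os.write(w, b"1" if got else b"0")
        finally:
            os._exit(0)

    os.close(w)
    answer = os.read(r, 1)
    os.close(r)
    os.waitpid(pid, 0)
    release.set()  # let the holder thread finish in the parent
    worker.join()
    return answer == b"1"
```

The child only gives up because of the timeout. A plain blocking `acquire()`, which is what most library code does, would hang there forever.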

    • malloc has to guard its locks against fork, probably using pthread_atfork, or some lower level internal API related to that.

      The problem with pthread_atfork is third party libs.

      YOU will use it in YOUR code. The C library will correctly use it in its code. But you have no assurance that any other libraries are doing the right things with their locks.
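Python exposes the same hook idea as `os.register_at_fork`, so the pthread_atfork pattern can be sketched without C: take the lock just before the fork and release it on both sides afterwards. `cache_lock` and `child_can_take_guarded_lock` are hypothetical names, and this is one common way to use the hooks, not necessarily what any particular malloc does.

```python
import os
import threading
import time
import warnings

cache_lock = threading.Lock()  # stand-in for a library-internal lock

# Hold the lock across fork() so the child never inherits it mid-use.
os.register_at_fork(
    before=cache_lock.acquire,
    after_in_parent=cache_lock.release,
    after_in_child=cache_lock.release,
)


def child_can_take_guarded_lock(timeout: float = 1.0) -> bool:
    """Fork while another thread briefly holds cache_lock."""
    held = threading.Event()

    def holder() -> None:
        with cache_lock:
            held.set()
            time.sleep(0.1)  # simulate a short critical section

    worker = threading.Thread(target=holder)
    worker.start()
    held.wait()

    r, w = os.pipe()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()  # the before-hook waits here for holder to finish
    if pid == 0:
        try:
            got = cache_lock.acquire(timeout=timeout)
            os.write(w, b"1" if got else b"0")
        finally:
            os._exit(0)

    os.close(w)
    answer = os.read(r, 1)
    os.close(r)
    os.waitpid(pid, 0)
    worker.join()
    return answer == b"1"
```

This only protects locks whose owner registered hooks, which is exactly the third-party problem: one library that skips this step is enough to leave a stuck lock in the child.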

  • fork came first; it's POSIX threads that is a bolted on piece of clunk that mixes badly with fork, signal handlers, chdir, ...