
Comment by jordemort

4 years ago

If you mix fork with threads, you're going to have an [undefined behavior] time. It seems like if you link with the sqlite that comes with macOS, you're using threads whether you like it or not. I think ending up at "you shouldn't use fork() at all" is a bit of an extreme conclusion, though.

BTW, article title needs a (2016). It appears that the relevant Python bug has long since been closed, by avoiding linking with the system sqlite on macOS.

> I think ending up at "you shouldn't use fork() at all" is a bit of an extreme conclusion, though.

Is it? There are more descriptive (as opposed to procedural) APIs which behave in a safer and more well-defined manner to do it these days. Unless you're implementing a shell, fork has never been a great tool.

As one commenter noted 3 months back:

> The dense fog lifts, tree branches part, a ray of light beams down on a pedestal revealing the hidden intentions of the ancients. A plaque states "The operational semantics of the most basic primitives of your operating system are designed to simplify the implementation of shells." You hesitantly lift your eyes to the item presented upon the pedestal, take a pause in respect, then turn away slumped and disappointed but not entirely surprised. As you walk you shake your head trying to evict the after image of a beam of light illuminating a turd.

https://news.ycombinator.com/item?id=30504470

  • As the author of a gist that trashes on fork(), I do nonetheless use it, usually early in daemons' lives:

      - to daemonize
      - to fork multiple worker processes
    

    And maybe POSIX-ish shells should use fork() for subshells, naturally.

    But I think that's about it for good uses of fork().

    For all process spawning uses of fork() I strongly recommend vfork() or posix_spawn() instead.

    • Isn't the first use-case a pretty debatable / bad one? By daemonizing internally, you make service management and supervision of the program much more difficult, and if you include a non-daemonizing mode for debugging you now have two different runmodes with a pretty significant semantics difference, only one of which is easily inspectable.


  • IMO, fork is strictly better than threads as a tool for running operations off the main thread: the child gets all the state it needs at the beginning and can use IPC to return the result.

    TFA does allow for off-process operations, but all of the inputs to the operation would need to be passed explicitly. In this sense, I suppose TFA isn't arguing against multiprocessing per se, but against the specific type that implicitly includes all of the current process state (which has both up- and down-sides).

    • > In this sense, I suppose TFA isn't arguing against multiprocessing per se, but against the specific type that implicitly includes all of the current process state (which has both up- and down-sides).

      You don't have to suppose anything, TFA specifically says that you should use posix_spawn or immediately exec() after forking.

      It doesn't imply or hint, let alone say, that threads are superior, it only mentions them because they interact badly with fork() and that's the issue they'd hit. It's not like threads are the only thing which interacts badly with fork.

    • That is basically how NGINX works if you run it in daemon mode. When you start or reload the server, the main process initializes common state, then forks to become a worker process. I would recommend avoiding any IPC past that point if possible, though.
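
The advice repeated in this subthread, to use posix_spawn() rather than fork() when the goal is simply to start another program, can be sketched roughly as follows. This is a minimal illustration in Python, whose os.posix_spawn() wraps the C function of the same name; the helper name spawn_and_wait is hypothetical and not taken from the article or the comments.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv[0] as a new process via posix_spawn() and return its exit code.

    Hypothetical helper for illustration only.
    """
    # posix_spawn() takes a description of the child (program path, arguments,
    # environment), so no Python code of ours runs in a forked copy of this
    # process between creation and exec.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # The child was killed by a signal: report it as a negative number.
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    # Run a tiny child interpreter that exits with a known status.
    code = spawn_and_wait([sys.executable, "-c", "raise SystemExit(7)"])
    print("child exit code:", code)
```

Because the child is described rather than cloned, none of the parent's in-memory state (locks held by other threads, buffered output) is carried into code that runs before exec, which is the hazard the comments above are pointing at.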

fork + threads is not undefined behavior. It is safe as long as you only do “async-signal safe” functions in the child. The child will be single-threaded.

The Linux man pages have a list of safe operations after fork here: https://man7.org/linux/man-pages/man7/signal-safety.7.html

Note that this includes most of your standard syscalls, like (importantly) write(), read(), close(), chdir(), as well as certain “obviously safe” library functions like strlen(), memcpy(), etc.

Non-multithreaded programs can fork() how they like and do whatever they want after (mostly).

  • > fork + threads is not undefined behavior. It is safe as long as you only do “async-signal safe” functions in the child. The child will be single-threaded.

    Yes, but the async-signal-safe restriction is pretty severe, so you have to know what you're doing. Yes, that's also true of vfork(), but at least vfork() will be much faster.

    > Non-multithreaded programs can fork() how they like and do whatever they want after (mostly).

    Only as long as they haven't used libraries that are not fork-safe prior to calling fork(). And you still need to do things like fflush() stdio handles prior to fork()ing.
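
The point about flushing stdio handles before fork() can be illustrated with a small sketch. It uses Python's buffered file objects as a stand-in for C stdio (an assumption made for brevity; the comment itself is about fflush() in C), and the function name fork_and_count is hypothetical. The child inherits a copy of any unflushed user-space buffer, so the same bytes can end up written twice.

```python
import os
import tempfile


def fork_and_count(flush_first):
    """Write one line through a buffered file, fork, and count the copies on disk.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.txt")
        f = open(path, "w")      # buffered: the line stays in user-space memory
        f.write("hello\n")
        if flush_first:
            f.flush()            # empty the buffer before fork() duplicates it
        pid = os.fork()
        if pid == 0:
            # Child: it holds its own copy of whatever was still buffered.
            f.flush()
            os._exit(0)          # exit without running the parent's cleanup
        os.waitpid(pid, 0)
        f.close()                # parent flushes its own copy of the buffer
        with open(path) as check:
            return check.read().count("hello")


if __name__ == "__main__":
    print("without flush:", fork_and_count(False))
    print("with flush:   ", fork_and_count(True))
```

Parent and child share the file offset, so the duplicate line is appended after the first one rather than overwriting it.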