Comment by ajkjk

1 day ago

Fork always seemed conceptually terrible to me, even when I first learned about it. If you want to do one thing (start a process), you should not have to use a mysterious incantation that does a different, unrelated thing (forks your process) in order to do it.

I am curious what the best way is to handle the article's example of one process spawning many git subprocesses. Surely it doesn't make sense to repeatedly start git from scratch in the course of a long-running parent operation. What's the low-cost abstraction for the same result, though?

spacechild1 1 day ago

Yeah, as someone who originally came from Windows, the fork+exec model never made sense to me. Now I know it's just a historical quirk, but for some reason there are still people who pretend that fork+exec is actually a good thing...

kps 21 hours ago

Fork is conceptually simple. Without bringing in any other layers, you start a process with the one thing known to exist: yourself.

Otherwise you need multiple steps to create a process, fill it with something to run, and arrange for it to execute. Or like Win32 you permanently smush them together with other layers, like filesystems and object loaders and linkers.
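
For concreteness, here is a minimal Python sketch of the clone-then-replace sequence being discussed. It assumes a Unix-like system and Python 3.9+; `fork_exec` is a hypothetical helper name, not something from the article.

```python
import os
import sys


def fork_exec(argv):
    """Start argv the classic Unix way: clone this process, then replace the clone.

    fork_exec is a hypothetical name used only for this illustration.
    """
    pid = os.fork()                   # the child begins as a copy of the parent
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # on success this call never returns
        finally:
            os._exit(127)             # reached only if exec itself failed
    _, status = os.waitpid(pid, 0)    # parent: block until the child exits
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    # The child is a fresh interpreter that exits with status 3.
    print(fork_exec([sys.executable, "-c", "raise SystemExit(3)"]))  # prints 3
```

Between `os.fork()` and `os.execvp()` the child still holds a copy of everything the parent had, which is the inheritance the replies below argue about.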

Too 14 hours ago
Fill with what stuff exactly?
The only things I want to inherit from the parent process are its cwd and environment variables, and even those are often overridden. The rest can easily be passed explicitly through other channels like pipes or command-line arguments.
Back to the example from the article: it makes no sense that a git subprocess forked from a web server needs to have any process state inherited from the web server.
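
A minimal sketch of that explicit style, assuming Python's standard `subprocess` module on a Unix-like system. `spawn_explicit` and the `GREETING` variable are made up for illustration.

```python
import os
import subprocess
import sys


def spawn_explicit(argv, cwd, env, stdin_bytes=b""):
    """Run argv, handing the child only what the caller passes in.

    spawn_explicit is a hypothetical helper, not an API from the article.
    """
    result = subprocess.run(
        argv,
        cwd=cwd,                 # working directory chosen by the caller
        env=env,                 # replaces the parent's environment entirely
        input=stdin_bytes,       # delivered to the child over a pipe on stdin
        stdout=subprocess.PIPE,  # child's output comes back over another pipe
        close_fds=True,          # POSIX default: don't leak other parent descriptors
        check=True,              # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


if __name__ == "__main__":
    child = "import os, sys; print(os.environ['GREETING'], sys.stdin.read())"
    out = spawn_explicit(
        [sys.executable, "-c", child],
        cwd="/",
        env={"GREETING": "hi", "PATH": os.environ.get("PATH", "")},
        stdin_bytes=b"payload",
    )
    print(out)
```

Anything not named in `env`, `cwd`, the arguments, or the pipe is simply absent from the child's point of view, even though the library may still fork internally to get there.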
- kps 2 hours ago
  
  > Fill with what stuff exactly?
  Yes, exactly. Cloning, as a process creation primitive, is the one thing that doesn't need to be concerned with other stuff.
  > … a git-subprocess forked from a web server …
  That's pulling in a whole load of assumptions that are distinct from process creation. You can have processes in an environment that has no concept of file system or persistent storage at all.
ajkjk 14 hours ago
I guess that way of thinking makes sense if you have a certain model of what a process is, in terms of the data structures, runtime state, etc. But, tbh, I think of processes as glorified function calls, which happen to have that stuff involved as an implementation detail. And if spawning a process is supposed to act like a function call, then of course it should not inherit state. You should call the function you want to call, not call yourself with an instruction to switch over to it instead.
- fluffybucktsnek 9 hours ago
  
  Conceptually, processes are more akin to units of isolation. Threads are closer to function calls.
IshKebab 10 hours ago
It's not conceptually simple. No other object creation API works by copying an existing thing and then modifying it. You don't create a new file by copying an existing one and then modifying it. You don't create a new window by copying an existing one and modifying it.
Attempting to justify clone/exec as a reasonable design is just Stockholm syndrome.
- kps 2 hours ago
  
  > No other object creation API works by copying an existing thing and then modifying it.
  Clone-and-modify is pretty common in CAD.
  > You don't create a new file by copying an existing one and then modifying it.
  Clone-and-modify is almost universal in version control systems.
  

wmf 1 day ago

libgit2 exists. You could imagine communicating with some gitd over a pipe/socket but I don't know why that would be a good idea. Short of that you have to spawn processes.

trumpdong 1 day ago

On Windows maybe it would be a COM server, using IPC built into the OS. The client sees it like a local function call.