
Comment by harry8

4 years ago

So for software architectures that separate concerns by spawning many short-lived processes and using message passing (which seems like a great idea; I just can't think of anything that does that, and would love examples if they exist), it /could/ be a factor, but we have no numbers. Do you see it?

Let's just say I want to design a solution that involves spawning a buttload of processes and passing messages back and forth. Roughly when does fork efficiency become something other than of academic concern? 10 processes per second, 1000, 100000? What does the inefficiency look like? Nothing? A stutter you might not notice? All the way through to everything grinding to a halt, where you can't log in to the box and the OOM killer won't help you either?
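One rough way to put a number on that question is to time a fork-and-reap loop on the box you care about. The sketch below is only illustrative: `measure_fork_rate` and the ballast sizes are made-up names and values, it assumes a POSIX system where Python's `os.fork` is available, and it forks strictly one child at a time, so it measures per-fork latency for a single-threaded parent rather than anything about Java or about many concurrent children.

```python
import os
import time


def measure_fork_rate(n=200):
    """Fork n children that exit immediately; return fork+wait cycles per second."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: exit at once, skipping Python's normal cleanup handlers.
            os._exit(0)
        # Parent: reap the child so zombies do not pile up.
        os.waitpid(pid, 0)
    elapsed = time.perf_counter() - start
    return n / elapsed


if __name__ == "__main__":
    # Grow the parent's resident memory to see how fork cost scales with it.
    # The sizes are arbitrary; the buffer is filled with a nonzero byte so
    # the pages are actually touched rather than lazily mapped.
    for mib in (0, 32, 128):
        ballast = b"x" * (mib * 1024 * 1024)
        rate = measure_fork_rate()
        print(f"{mib:4d} MiB ballast: {rate:8.0f} fork+wait cycles/s")
        del ballast
```

Whatever numbers come out are specific to the machine, kernel, and parent process size, which is rather the point: the threshold only means something once it has been measured in the setting that matters.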

That's a fair question. Basically, don't call fork() in Java (JNI or alike), or Java classes that do, and you might be fine, and if ever you're not, you'll know where to start looking.

  • Don't ever call fork from Java? Not even once? And what are the consequences of calling fork? A minor stutter? Halt and catch fire? I don't Java, but it's hardly new tech. Surely someone has done some numbers on competing operating systems in the past couple of decades?

    Until you quantify on some level, even very roughly, what the observed issue is, when you see it, and how it degrades, trying to optimize it is just urinating into the breeze. Getting lucky is the best outcome we can hope for, and the chances of it being a really good outcome are pretty limited. Decrying something as "inefficient" based on big O or whatever is just meaningless until we actually run it and measure. [1]

    [1] Selection sort is O(n^2) and can totally dominate O(n log n) algorithms in actual time and cycles spent, depending on circumstances. We have to be specific; that step can't be skipped, because skipping it will likely produce a terrible result.

    • I have had to debug slow forking cases with Java. No, I can't point you at data from those. I can point you to the Microsoft paper and @famzah's posts if you want data. For Microsoft this is an important topic: they don't want to have to implement a real fork(), and I fully understand why they don't want to. My guess is they will eventually buckle and do it. fork() is not easy to implement.
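The footnote's point about selection sort versus O(n log n) algorithms is easy to check rather than argue about. The sketch below is a hypothetical harness, not anything from the thread: `selection_sort`, `merge_sort`, and `time_sorts` are illustrative names, both sorts are deliberately naive pure-Python versions, and which one wins at which input size is left for the timings to show on a given machine.

```python
import random
import timeit


def selection_sort(items):
    """O(n^2): repeatedly swap the smallest remaining element into place."""
    a = list(items)
    for i in range(len(a)):
        smallest = i
        for j in range(i + 1, len(a)):
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a


def merge_sort(items):
    """O(n log n): sort each half recursively, then merge the halves."""
    a = list(items)
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these still has elements left over.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def time_sorts(n, repeats=200):
    """Time both sorts on the same random list of length n."""
    data = [random.random() for _ in range(n)]
    return {
        "selection": timeit.timeit(lambda: selection_sort(data), number=repeats),
        "merge": timeit.timeit(lambda: merge_sort(data), number=repeats),
    }


if __name__ == "__main__":
    for n in (4, 16, 64, 512):
        t = time_sorts(n)
        print(f"n={n:4d}  selection={t['selection']:.4f}s  merge={t['merge']:.4f}s")
```

The crossover point, if there is one, depends on the language, the implementations, and the data, which is the same reason the fork question needs measurements before anyone calls it inefficient.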