Comment by epistasis
3 hours ago
Thanks for the EarlyOOM pointer, it's one that I found (via HN) while investigating why an entire process group was getting killed rather than single processes.
The problem is not that OOM killing happens earlier under memory pressure, but rather what gets killed. Previously the offending process would get killed. Now it's an entire cgroup. So if you are using process isolation to run a batch of computation jobs, each of which takes a different amount of memory, and it is not foreseeable until runtime which one will take too much, the OOM killer takes out the batch manager and its shell and everything. A process can't know ahead of time if it's taking too much memory, because allocations never fail, and the process itself shouldn't be monitoring what is going on in the rest of the system to make runtime decisions to quit. The entire batch of jobs is killed, rather than a single process dying (as happens for any number of errors) and the batch continuing on with the rest of the jobs. In fact, without interacting directly with systemd-run to create a new cgroup, it's impossible to monitor WTF happened to your process because of this new "nuke it from orbit" behavior.
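To make the workaround concrete, here is a rough sketch of what "interacting directly with systemd-run" can look like from a batch manager. It assumes `systemd-run` is available and that a user session manager is running; the function names (`scoped_command`, `run_batch`) and the `MemoryMax` cap are my own illustration, not anything from a real batch system. Each job goes in its own transient scope, so a cgroup-level kill should only take out that one job, and the manager can see that it died from SIGKILL.

```python
import signal
import subprocess


def scoped_command(cmd, memory_max=None):
    """Wrap cmd so it runs in its own transient systemd scope (its own cgroup).

    Assumption: systemd-run exists and a user manager is running.
    memory_max is an optional, hypothetical per-job cap such as "2G".
    """
    wrapped = ["systemd-run", "--user", "--scope", "--quiet"]
    if memory_max is not None:
        wrapped += ["-p", f"MemoryMax={memory_max}"]
    return wrapped + ["--"] + list(cmd)


def run_batch(jobs, wrap=scoped_command):
    """Run jobs one after another; a killed job does not stop the batch."""
    results = {}
    for name, cmd in jobs.items():
        proc = subprocess.run(wrap(cmd))
        if proc.returncode == -signal.SIGKILL:
            # A negative return code means the child died from that signal.
            # SIGKILL is what an OOM kill looks like from the parent's side,
            # though it does not prove that an OOM killer sent it.
            results[name] = "killed"
        elif proc.returncode == 0:
            results[name] = "ok"
        else:
            results[name] = f"failed({proc.returncode})"
    return results
```

Which is exactly the complaint: the batch manager now has to know about systemd to get back the old one-process-dies behavior, and this code does nothing useful on a system without it.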
During my searches on this, another common failure case I found is an IDE where one process goes wild and takes too much memory, and then the whole IDE gets killed silently, instead of just that single process being killed, which would let the app save state.
This is a very fundamental change to how Linux has worked, it's a novel concept unfamiliar to longtime users (who the fuck actually knows about cgroups or uses them extensively except for people heavy into containerization?), and workarounds for the behavior require introducing a heavy dependency on systemd in order to get basic functionality, making my code far less portable. I can understand being dependent on GNU, and some Linuxisms in syscalls, but changing the basic semantics of launching new processes such that new code dependencies are needed for intricate cgroup control, well, that's a bit much for me. Leave systemd-oomd to manage cgroups and containers, but having it manage desktop apps and standard Unix process launching leads to bad code.