Comment by amalcon

3 years ago

Interestingly (and confusingly), Linux's OOM killer is invoked for a different notion of OOM than a null return from malloc or a bad_alloc exception. On a 64-bit machine, the latter will pretty much only ever happen if you set a vsize ulimit (ulimit -v) or pass an absurd size into malloc. The OOM killer is the only response when you actually run out of memory.

If you want to avoid your program triggering the OOM killer all on its own, you need to set a vsize limit such that you get an application-level error before actually exhausting memory. Even that isn't completely foolproof (obviously anyone with a shell can allocate a large amount of RAM), but in practice, if your program is the only significant thing on the system, you can make it very reliable this way.
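
As a rough illustration of the vsize idea, here is a minimal Python sketch, assuming Linux and that the vsize limit in question is RLIMIT_AS (what `ulimit -v` sets). The helper name `run_with_vsize_limit` and the sizes used are made up for the example. The limit is applied inside a child process so the calling process is left untouched.

```python
import subprocess
import sys

# Code run in the child process: cap the address space, then try to allocate.
CHILD_CODE = """
import resource, sys
limit, size = int(sys.argv[1]), int(sys.argv[2])
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    limit = min(limit, hard)          # the soft limit cannot exceed the hard limit
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    buf = bytearray(size)             # fails up front if it exceeds the limit
except MemoryError:
    print("MemoryError")              # application-level error, no OOM killer
else:
    print("allocated")
"""

def run_with_vsize_limit(limit_bytes, alloc_bytes):
    """Hypothetical helper: run one allocation under an address-space limit."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(limit_bytes), str(alloc_bytes)],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    one_gib = 1 << 30
    print(run_with_vsize_limit(one_gib, 4 * one_gib))  # request exceeds the limit
    print(run_with_vsize_limit(one_gib, 1 << 20))      # small request fits
```

The point is only that the oversized request fails at allocation time with an error the program can handle, instead of succeeding and exhausting memory later.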

Add in some cgroup settings and you should be able to keep your program from being OOM killed at all, though that step is a bit more complex.

I wonder if it is possible to avoid OOM by making sure that all allocations come from a named memory-mapped file (on disk, not shm). That way it is, in principle, always possible to swap to disk, and memory is never overcommitted.

I guess in practice the kernel might be in such dire straits that it is not able to even swap to disk and might need to kill indiscriminately.
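
For concreteness, the file-backed allocation idea might look something like this minimal Python sketch. The helper name `file_backed_buffer` is hypothetical, and it assumes the chosen directory lives on a real disk filesystem (the default temp directory may be tmpfs, which would defeat the purpose).

```python
import mmap
import os
import tempfile

def file_backed_buffer(size, directory=None):
    """Hypothetical helper: return (buffer, path) for a named, file-backed mapping.

    `directory` is assumed to be on a disk filesystem, not tmpfs/shm.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.ftruncate(fd, size)       # give the file its full apparent size
        buf = mmap.mmap(fd, size)    # shared, read/write mapping on Unix
    finally:
        os.close(fd)                 # the mapping keeps its own descriptor
    return buf, path

if __name__ == "__main__":
    buf, path = file_backed_buffer(1 << 20)
    buf[:5] = b"hello"               # dirty pages can be written back to the file
    buf.flush()
    buf.close()
    os.unlink(path)
```

Pages of such a mapping are backed by the file rather than by anonymous memory, so the kernel can write them out to the file instead of needing swap.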

  • You would also need to prevent overcommit of disk; you'd typically mmap to a sparse file, and then you've got the same problem of overcommit on disk as you did in memory.

    If you're going to do drastic things, you can configure Linux's memory overcommit behavior (the vm.overcommit_memory sysctl; mode 2 turns on strict accounting), although strictly avoiding overcommit usually results in trouble from software not written with that in mind.
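
The sparse-file point can be seen with a small Python sketch, assuming Linux and a filesystem that supports sparse files. The helper names are made up for the example. It compares the blocks actually allocated after extending a file with `ftruncate` against the allocation after reserving the space with `posix_fallocate`.

```python
import os
import tempfile

def allocated_bytes(fd):
    """Bytes of disk actually allocated to the file (st_blocks is in 512-byte units)."""
    return os.fstat(fd).st_blocks * 512

def sparse_vs_reserved(size):
    """Hypothetical helper: return (allocated when sparse, allocated after reserving)."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        os.ftruncate(fd, size)           # apparent size grows, blocks are not reserved
        sparse = allocated_bytes(fd)
        os.posix_fallocate(fd, 0, size)  # ask the filesystem to reserve the space
        reserved = allocated_bytes(fd)
    return sparse, reserved

if __name__ == "__main__":
    sparse, reserved = sparse_vs_reserved(1 << 20)
    print("sparse:", sparse, "reserved:", reserved)
```

A mapping over the sparse version can still fail later when the disk fills up, which is the disk equivalent of memory overcommit; reserving the space up front moves that failure to allocation time.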