Comment by masklinn
1 day ago
glibc will return memory to the OS just fine; the problem is that its arena design is extremely prone to fragmentation, so you end up with a bunch of arenas that are almost but not quite empty and can't be released, but can't really be used either.
In fact, Jason himself (the author of jemalloc and TFA) posted an article on glibc malloc fragmentation 15 years ago: https://web.archive.org/web/20160417080412/http://www.canonw...
And it's an issue to this day: https://blog.arkey.fr/drafts/2021/01/22/native-memory-fragme...
glibc does NOT return memory to the OS just fine.
In my experience it delays it way too much, causing memory overuse and OOMs.
I have a Python program that allocates 100 GB for some work, free()s it, and then calls a subprocess that takes 100 GB as well. Because the memory use is serial, it should fit in 128 GB just fine. But it gets OOM-killed, because glibc does not turn the free() into an munmap() before the subprocess is launched, so it needs 200 GB total, with 100 GB sitting around pointlessly unused in the Python process.
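The failure pattern described above can be sketched like this (sizes scaled way down for illustration; the real program used ~100 GB, and whether the freed pages actually get unmapped before the child starts depends on the allocator):

```python
import subprocess

# Phase 1: the parent allocates and frees a large working buffer.
buf = bytearray(200 * 1024 * 1024)  # stand-in for the ~100 GB working set
del buf  # free()d, but glibc may keep the pages mapped in this process

# Phase 2: a subprocess needs the same amount of memory. If the parent's
# freed pages were not munmap()ed, peak system usage is roughly doubled,
# even though the two allocations never overlap in time.
child = subprocess.run(
    ["python3", "-c", "b = bytearray(200 * 1024 * 1024); print(len(b))"],
    capture_output=True, text=True,
)
print(child.stdout.strip())
```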
This means that if you use glibc, you have no idea how much memory your system will use and whether your processes will get OOM-killed, even if your applications are carefully designed to avoid it.
Similar experience: https://sourceware.org/bugzilla/show_bug.cgi?id=14827
Open for 13 years now. This stuff doesn't seem to get fixed.
The fix in general is to use jemalloc with a `MALLOC_CONF` setting that tells it to immediately munmap() at free().
So in jemalloc, the settings to control this behaviour seem to actually work, in contrast to glibc malloc.
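For reference, a hedged sketch of what that configuration can look like with jemalloc 5.x (the decay options are real jemalloc tunables, where 0 means "purge unused pages immediately"; the library path is an assumption and varies by distro):

```shell
# Preload jemalloc and make it purge freed pages immediately instead of
# keeping them around for reuse.
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2  # path varies
export MALLOC_CONF="dirty_decay_ms:0,muzzy_decay_ms:0"
python3 your_program.py
```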
(I'm happy to be proven wrong here, but so far no combination of settings seems to actually make glibc return memory as described in its docs.)
From this perspective, it is frightening to see the jemalloc repo being archived, because that was my way to make sure stuff doesn't OOM in production all the time.
You could do the memory-heavy Python part in a separate process as well. That removes the need to depend on quirks of the allocator.
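That suggestion could look something like this (a minimal sketch with made-up names and a scaled-down allocation; it relies on the kernel reclaiming everything when the worker exits, regardless of allocator behaviour):

```python
import multiprocessing as mp

def heavy_work(nbytes):
    # All large allocations happen inside the worker process.
    buf = bytearray(nbytes)
    # ... the actual memory-heavy computation would go here ...
    return len(buf)

# Run the memory-heavy phase in a throwaway child process. When it exits,
# the kernel reclaims every page it mapped, no matter what the allocator
# did, so a later subprocess sees the full memory budget again.
ctx = mp.get_context("fork")  # fork avoids re-importing this module (Linux)
with ctx.Pool(processes=1) as pool:
    result = pool.apply(heavy_work, (64 * 1024 * 1024,))
print(result)
```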