Comment by jeffbee
1 day ago
I don't think that's really a position that can be defended. Both jemalloc and tcmalloc evolved and were refined in antagonistic multitenant environments without one overwhelming application. They are optimal for that exact thing.
> Both jemalloc and tcmalloc evolved and were refined in antagonistic multitenant environments without one overwhelming application. They are optimal for that exact thing.
They were mostly optimised on Facebook/Google server-side systems, which were likely one application per VM, no? (Unlike desktop usage where users want several applications to run cooperatively). Firefox is a different case but apparently mainline jemalloc never matched Firefox jemalloc, and even then it's entirely plausible that Firefox benefitted from a "selfish" allocator.
Google runs dozens to hundreds of unrelated workloads in lightweight containers on a single machine, in "borg". Facebook has a thing called "tupperware" with the same property.
It's possible that they were referring to something specific about their platform and its system allocator, but like I said it was an anecdote about one engineer's statement. I just remember thinking it sounded fair at the time.
The “system” allocator is managing memory within a process boundary. The kernel is responsible for managing it across processes. Claiming that a user space allocator is greedily inefficient is voodoo reasoning that suggests the person making the claim has a poor grasp of architecture.
For context, the "allocator engineer" I was talking to was a kernel engineer - they have an extremely solid grasp of their platform's architecture.
The whole advantage of being the platform's system allocator is that you can have a tighter relationship between the library function and the kernel implementation.
There are shared resources involved though, for example one process can cause a lot of traffic in khugepaged. However I would point out that is an endemic risk of Linux's overall architecture. Any process can cause chaos by dirtying pages, or otherwise triggering reclaim.
The "greedy" part is likely not releasing pages back to the OS in a timely manner.
3 replies →