Comment by jeffbee
3 days ago
I don't think that's really a position that can be defended. Both jemalloc and tcmalloc evolved and were refined in antagonistic multitenant environments without one overwhelming application. They are optimal for that exact thing.
> Both jemalloc and tcmalloc evolved and were refined in antagonistic multitenant environments without one overwhelming application. They are optimal for that exact thing.
They were mostly optimised on Facebook/Google server-side systems, which were likely one application per VM, no? (Unlike desktop usage where users want several applications to run cooperatively). Firefox is a different case but apparently mainline jemalloc never matched Firefox jemalloc, and even then it's entirely plausible that Firefox benefitted from a "selfish" allocator.
Google runs dozens to hundreds of unrelated workloads in lightweight containers on a single machine, in "borg". Facebook has a thing called "tupperware" with the same property.
I think Tupperware was rebranded to Twine about 6-7 years ago.
It's possible that they were referring to something specific about their platform and its system allocator, but like I said it was an anecdote about one engineer's statement. I just remember thinking it sounded fair at the time.
The “system” allocator is managing memory within a process boundary. The kernel is responsible for managing it across processes. Claiming that a user space allocator is greedily inefficient is voodoo reasoning that suggests the person making the claim has a poor grasp of architecture.
There are shared resources involved, though: for example, one process can cause a lot of traffic in khugepaged. However, I would point out that this is an endemic risk of Linux's overall architecture. Any process can cause chaos by dirtying pages or otherwise triggering reclaim.
For context, the "allocator engineer" I was talking to was a kernel engineer - they have an extremely solid grasp of their platform's architecture.
The whole advantage of being the platform's system allocator is that you can have a tighter relationship between the library function and the kernel implementation.
The "greedy" part is likely not releasing pages back to the OS in a timely manner.
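To make the "greedy" point concrete, here is a toy model in Python. It is only a sketch and not how jemalloc, tcmalloc, or any system allocator is actually implemented: `ToyAllocator`, `decay_s`, and `footprint_after_burst` are hypothetical names invented for illustration. The assumption is that freed pages sit in an allocator-side cache and are only handed back to the OS once they have been idle longer than some retention time.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAllocator:
    """Toy model: freed pages stay cached until they are older than decay_s."""
    decay_s: float   # hypothetical knob: how long freed pages are retained
    in_use: int = 0  # pages currently handed out to the application
    cached: list = field(default_factory=list)  # free timestamps of retained pages

    def alloc(self, pages: int) -> None:
        # Reuse cached pages first; the rest would be a fresh request to the OS.
        reused = min(pages, len(self.cached))
        del self.cached[:reused]
        self.in_use += pages

    def free(self, pages: int, now: float) -> None:
        # Freed pages are not returned to the OS yet, only remembered with a timestamp.
        self.in_use -= pages
        self.cached.extend([now] * pages)

    def purge(self, now: float) -> int:
        """Release pages idle for at least decay_s; return how many were released."""
        keep = [t for t in self.cached if now - t < self.decay_s]
        released = len(self.cached) - len(keep)
        self.cached = keep
        return released

    def resident(self) -> int:
        # What the kernel would see charged to the process: live plus retained pages.
        return self.in_use + len(self.cached)


def footprint_after_burst(decay_s: float) -> int:
    a = ToyAllocator(decay_s=decay_s)
    a.alloc(1000)         # burst of work at t=0
    a.free(900, now=1.0)  # most of it is freed at t=1
    a.purge(now=2.0)      # one second later the allocator gets a chance to purge
    return a.resident()


print(footprint_after_burst(decay_s=0.5))   # short retention
print(footprint_after_burst(decay_s=10.0))  # long retention
```

In this model both runs have the same 100 live pages after the burst, but the long-retention run still holds the 900 freed pages, which is memory other processes on the machine cannot use until the allocator lets go of it. The trade-off is that the retained pages make the next burst cheaper, which is roughly why an allocator might be tuned that way.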