← Back to context

Comment by mrweasel

1 day ago

Looking at all the comments and lightly browsing the source code, I'm amazed. Both at how much impact a memory allocator can make, but also how much code is involved.

I'm not really sure what I expected, but somehow I expect a memory allocator to be ... smaller, simpler perhaps?

Memory allocators can be simple. In fact it was an assignment for a course in the 2nd year of my CS degree to make an (almost) complete allocator.

However it is typically always more complex to make production quality software, especially in a performance sensitive domain.

  • Naive allocators are very easy: just subdivide RAM and defragment only when absolutely necessary (if virtual memory is unavailable). Performant allocators are hard.

    I think we lost a great deal of potential when ORCA was too tied to Pony and not extracted to a framework, tool, and/or library useful outside of it such as integrated or working with LLVM.

You can write a simple size-class allocator (even lock-free) in just a couple dozen lines of code. (I've done it both for interviews and for a work presentation.) But an allocator that is fast, scalable, and performs well over diverse workloads--that is HARD.

It’s the same way with garbage collectors.

You can write a naive mark-and-sweep in an afternoon. You can write a reference counter in even less time. And for some runtimes this is fine.

But writing a generational, concurrent, moving GC takes a lot of time. But if you can achieve it, you can get amazing performance gains. Just look at recent versions of Java.

mimalloc is cleaner but lacks the very useful profiling features. To be fair it also has not gone through decades of changes as described in the postmortem either.