Comment by sakras

3 years ago

Fantastic article! I have a project with a similar arena allocator, so I'll definitely be taking some of these tricks. One thing my allocator does do is organize arenas into a linked list so that you can grow your size dynamically. However I really like the article's point that you're always going to be living within _some_ memory budget, so you might as well allocate everything up front into a giant arena, and then divide the giant arena up into smaller arenas.
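For anyone following along, here is roughly what "one giant arena divided into smaller arenas" could look like in C. This is only a sketch: the `Arena` type and the `arena_new`, `arena_alloc`, and `arena_split` names are made up for illustration and are not taken from the article, and the 16-byte alignment for child arenas is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical arena type: a bump pointer over the range [beg, end). */
typedef struct {
    char *beg;
    char *end;
} Arena;

/* Grab the whole budget up front. On failure, beg is NULL. */
static Arena arena_new(ptrdiff_t cap)
{
    Arena a = {0};
    a.beg = malloc((size_t)cap);
    a.end = a.beg ? a.beg + cap : NULL;
    return a;
}

/* Bump-allocate size bytes aligned to align (a power of two).
   Returns NULL when the arena cannot satisfy the request. */
static void *arena_alloc(Arena *a, ptrdiff_t size, ptrdiff_t align)
{
    assert(size >= 0 && align > 0 && (align & (align - 1)) == 0);
    /* Bytes needed to round beg up to the next multiple of align. */
    ptrdiff_t pad = (ptrdiff_t)(-(uintptr_t)a->beg & (uintptr_t)(align - 1));
    ptrdiff_t avail = a->end - a->beg - pad;
    if (avail < 0 || size > avail) {
        return NULL;
    }
    void *p = a->beg + pad;
    a->beg += pad + size;
    return p;
}

/* Carve a child arena of cap bytes out of the parent.
   Returns 1 on success, 0 if the parent is too small. */
static int arena_split(Arena *parent, ptrdiff_t cap, Arena *child)
{
    char *mem = arena_alloc(parent, cap, 16);
    if (!mem) {
        return 0;
    }
    child->beg = mem;
    child->end = mem + cap;
    return 1;
}
```

A program would call `arena_new` once at startup, then `arena_split` to hand each subsystem its own slice, for example a permanent arena and a scratch arena.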

Also I've heard that you can save an instruction when checking if your allocator is full by subtracting from the top, and checking the zero flag. It seems to complicate alignment logic. Does that ever end up mattering?
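To make the alignment question concrete, here is a minimal sketch of a downward-bumping allocator. The `DownArena` type and `down_alloc` name are hypothetical. It does not try to demonstrate the instruction saving, which depends on the compiler and target; it only shows what the alignment step looks like when you subtract from the top. In this sketch, at least, aligning is a single round-down mask after the subtraction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical downward-bumping arena: memory is handed out from
   top toward base. */
typedef struct {
    char *base;  /* lowest usable address */
    char *top;   /* the next allocation ends here */
} DownArena;

/* Allocate size bytes aligned to align (a power of two), or NULL. */
static void *down_alloc(DownArena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t top = (uintptr_t)a->top;
    uintptr_t base = (uintptr_t)a->base;
    if (size > top - base) {
        return NULL;  /* the request alone would run past the bottom */
    }
    /* Subtract, then round down to the alignment with one mask. */
    uintptr_t p = (top - size) & ~(uintptr_t)(align - 1);
    if (p < base) {
        return NULL;  /* rounding down pushed us below the bottom */
    }
    a->top -= top - p;
    return a->top;
}
```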

> However I really like the article's point that you're always going to be living within _some_ memory budget, so you might as well allocate everything up front into a giant arena, and then divide the giant arena up into smaller arenas.

That depends. If you’re running on e.g. a video game console where you’re the sole user of a block of pretty much all memory, go ahead. On a system with other things running, you generally don’t want to assume you can just take some amount of memory, even if it’s “just the free memory”, or even “I probably won’t use it so it will be overcommitted”. Changing system conditions and other system pressure are outside of your control and your reservation may prevent the system from doing its job effectively and prioritizing your application appropriately.

  • Yeah, profiling is your friend. I forget if it's called a sharded slab or a buddy allocator, but I mean the one where you have different preallocated buffers chunked at different sizes. Any time you allocate, you are given the smallest chunk that will hold what you asked for. Profiling gives you the optimal size boundaries as well as the number of chunks of each size. Add a safety margin and off you go. Super fast allocation and guaranteed no external fragmentation. In a C++ codebase, replacing the global operator new to do this is probably the easiest way to get your allocation performance back and avoid fragmentation.

  • > If you’re running on e.g. a video game console where you’re the sole user of a block of pretty much all memory

    Games consoles haven't been that for a long time. PS5 and XSS are full-blown multi-user, multi-application systems. PS4 and Xbox One were multi-user systems with reserved blocks for the OS, but still very close to a modern OS.
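On the allocator described in the first reply above (preallocated buffers chunked at different sizes): that scheme is usually described as a size-class or segregated-free-list pool, whereas a buddy allocator is a different design that splits and merges power-of-two blocks. A rough sketch in C follows. The class sizes and counts are invented placeholders for what profiling would give you, and all the names (`Pool`, `pool_init`, `pool_alloc`, `pool_free`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { NUM_CLASSES = 3 };

/* Placeholder size classes and chunk counts. In practice both would
   come from profiling, plus a safety margin. */
static const size_t class_size[NUM_CLASSES]  = {32, 128, 512};
static const size_t class_count[NUM_CLASSES] = {64, 32, 8};

typedef struct FreeNode {
    struct FreeNode *next;
} FreeNode;

typedef struct {
    char     *mem;                     /* one preallocated block */
    FreeNode *free_list[NUM_CLASSES];  /* one free list per class */
} Pool;

/* Preallocate every chunk and thread it onto its class's free list.
   Returns 1 on success, 0 if the backing allocation fails. */
static int pool_init(Pool *p)
{
    size_t total = 0;
    for (int c = 0; c < NUM_CLASSES; c++) {
        total += class_size[c] * class_count[c];
    }
    p->mem = malloc(total);
    if (!p->mem) {
        return 0;
    }
    char *cur = p->mem;
    for (int c = 0; c < NUM_CLASSES; c++) {
        p->free_list[c] = NULL;
        for (size_t i = 0; i < class_count[c]; i++) {
            FreeNode *n = (FreeNode *)cur;
            n->next = p->free_list[c];
            p->free_list[c] = n;
            cur += class_size[c];
        }
    }
    return 1;
}

/* Index of the smallest class that fits size, or -1 if none does. */
static int pool_class(size_t size)
{
    for (int c = 0; c < NUM_CLASSES; c++) {
        if (size <= class_size[c]) {
            return c;
        }
    }
    return -1;
}

/* Pop a chunk from the matching class. This sketch returns NULL when
   that class is exhausted instead of falling back to a larger one. */
static void *pool_alloc(Pool *p, size_t size)
{
    int c = pool_class(size);
    if (c < 0 || !p->free_list[c]) {
        return NULL;
    }
    FreeNode *n = p->free_list[c];
    p->free_list[c] = n->next;
    return n;
}

/* Push a chunk back; size must be the size originally requested. */
static void pool_free(Pool *p, void *ptr, size_t size)
{
    int c = pool_class(size);
    assert(c >= 0);
    FreeNode *n = ptr;
    n->next = p->free_list[c];
    p->free_list[c] = n;
}
```

Allocation and free are each a couple of pointer moves, and chunks never need coalescing. The cost is internal waste: a 33-byte request occupies a 128-byte chunk with these placeholder classes, which is why the size boundaries matter.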

> so you might as well allocate everything up front into a giant arena, and then divide the giant arena up into smaller arenas

However, if you do this, note that the article hints at this strategy needing a bit more code on Windows: Windows doesn't overcommit. If you do one big malloc, Windows charges the whole allocation against the commit limit, growing the page file if necessary, so that it can back every page once you start writing to it. That's fine if you allocate a couple of megabytes, but if your arena is gigabytes in size, you want to call VirtualAlloc with MEM_RESERVE to get a big contiguous address range, then call VirtualAlloc with MEM_COMMIT as needed on the chunks you actually want to use.
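A rough sketch of that reserve-then-commit pattern in C. The Windows branch uses VirtualAlloc and VirtualFree as described above. The non-Windows branch is my own addition so the sketch builds elsewhere: it approximates the same idea with an mmap of PROT_NONE pages followed by mprotect. The `VArena` type, the `vm_*` and `varena_*` names, and the 64 KiB commit chunk are all assumptions for illustration.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS under strict glibc modes */
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* Reserve address space only; nothing is charged to commit yet. */
static void *vm_reserve(size_t size)
{
    return VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
}

/* Commit part of a previously reserved range. */
static int vm_commit(void *addr, size_t size)
{
    return VirtualAlloc(addr, size, MEM_COMMIT, PAGE_READWRITE) != NULL;
}

static void vm_release(void *base, size_t size)
{
    (void)size;
    VirtualFree(base, 0, MEM_RELEASE);
}
#else
#include <sys/mman.h>

/* Assumed POSIX stand-in: inaccessible pages act as the reservation. */
static void *vm_reserve(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

static int vm_commit(void *addr, size_t size)
{
    return mprotect(addr, size, PROT_READ | PROT_WRITE) == 0;
}

static void vm_release(void *base, size_t size)
{
    munmap(base, size);
}
#endif

/* Commit granularity; assumed to be a multiple of the page size. */
enum { COMMIT_CHUNK = 1 << 16 };

typedef struct {
    char  *base;       /* start of the reserved range */
    size_t reserved;   /* bytes of address space reserved */
    size_t committed;  /* bytes currently usable */
    size_t used;       /* bytes handed out so far */
} VArena;

static int varena_init(VArena *a, size_t reserve_bytes)
{
    a->base = vm_reserve(reserve_bytes);
    a->reserved = reserve_bytes;
    a->committed = 0;
    a->used = 0;
    return a->base != NULL;
}

/* Bump-allocate, committing more chunks only when needed. Results are
   16-byte aligned because used is kept at a multiple of 16. */
static void *varena_alloc(VArena *a, size_t size)
{
    if (size > a->reserved - a->used) {
        return NULL;
    }
    size_t need = (a->used + size + 15) & ~(size_t)15;
    if (need > a->reserved) {
        return NULL;
    }
    if (need > a->committed) {
        size_t target = (need + COMMIT_CHUNK - 1) & ~(size_t)(COMMIT_CHUNK - 1);
        if (target > a->reserved) {
            target = a->reserved;
        }
        if (!vm_commit(a->base + a->committed, target - a->committed)) {
            return NULL;
        }
        a->committed = target;
    }
    void *p = a->base + a->used;
    a->used = need;
    return p;
}
```

With this shape you can reserve gigabytes up front and still only pay for the chunks you have actually touched.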