← Back to context

Comment by ncruces

19 hours ago

So how can you implement an allocator in the language itself? Not even talking about malloc (which gets to have language blessed semantics); say an arena allocator.

You get a bunch of memory from mmap. There are no “objects” (in C terminology) in there, and a single “provenance” (if at all, you got it from a syscall, which is not part of the language).

If arbitrary integer/pointer math inside that buffer is not OK, how do to get heterogeneous allocations from the arena to work? When do slices of it become “objects” and gain any other ”provenance”?

Is C supposed to be the language you can't write an arena allocator in (or a conservative GC, or…)?

Handling allocators correctly it is actually quite problematic. In C++ you would placement-new into the raw storage, which ends the lifetime of whatever was there and start the lifetime of a new object, and as long as you use the pointer returned by operator new (or use std::launder), formally you are ok.

Famously you cannot implement an allocator in C into static named storage; I understand that on anonymous memory (like that returned by sbrk or mmap, or an upstream allocator) it should work fine, but, as a C++ programmer, I'm not familiar with the specific details of the C lifetime model that allow it. I understand that stores into anonymous memory can change the dynamic type (or whatever is the C equivalent) of the object.

In any case the issue is around object lifetimes and aliasing instead of pointer provenance: you can treat the anonymous memory is just a char array and you can safely form pointers into it and will carry the correct provenance information.

You don't that is why people that never bothered with ISO C legalese have this idea of several tricks being C, when they are in practice "Their C Compiler" language.

Several things in C, if you want to stay within the guarantees of ISO C for portable code, have to be written in straight Assembly, not even inline, as the standard only defines a asm keyword must exist, leaving all other details to the implementation.