Comment by matheusmoreira

3 years ago

Excellent article.

> While you could make a big, global char[] array to back your arena, it’s technically not permitted (strict aliasing).

Aren't char pointers/arrays allowed to alias everything?

I used that technique in my programming language and its allocator. It's freestanding so I couldn't use malloc. I had to get memory from somewhere so I just statically allocated a big array of bytes. It worked perfectly but I do disable strict aliasing as a matter of course since in systems programming there's aliasing everywhere.

char can alias everything, i.e. you can dereference a char pointer with impunity, even if the actual dynamic type[1] of an object is a different type. The reverse is not true: if the dynamic type of an object is char, just by using the alias rules, you can't dereference it as an object of a different type.

In C++ you can just use placement new to change the dynamic type of (part of) a char array (but beware of pointer provenance). In C it is more complex: I don't claim to understand this fully, but my understanding is you can't change the type of a named object, but you should be able to change the type of anonymous memory (for example, what is allocated with malloc) by simply writing into it.

In practice at least GCC considers the full alias rules unimplementable and my understanding is that it uses a conservative model where every store can change the type of an object, and uses purely structural equivalence.

[1] of course most C implementations don't actually track dynamic types at runtime.
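
A small sketch of the asymmetry described above, assuming C99 or later. The helper names `sum_bytes` and `load_u32` are made up for illustration. Reading any object through `unsigned char *` is allowed, while getting a `uint32_t` out of a char array is done here with `memcpy` rather than a pointer cast:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Allowed direction: inspect the bytes of any object through
 * a character-type pointer. */
static unsigned sum_bytes(const void *obj, size_t n) {
    const unsigned char *bytes = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += bytes[i];
    }
    return sum;
}

/* Reverse direction: the buffer's declared type is a char array, so
 * *(const uint32_t *)buf would read it through an incompatible lvalue.
 * Copying the bytes into a real uint32_t avoids that access. */
static uint32_t load_u32(const unsigned char *buf) {
    uint32_t value;
    memcpy(&value, buf, sizeof value);
    return value;
}
```

Compilers typically turn a fixed-size `memcpy` like this into a plain load, so the copy usually costs nothing at runtime.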

  • > The reverse is not true: if the dynamic type of an object is char, just by using the alias rules, you can't dereference it as an object of a different type.

    A limitation like that simply makes no sense to me. Everything is a valid char array but I can't place structs on top of one? Oh well, nothing I can do about it. I'll just keep strict aliasing disabled. If we're writing C, it's because we want to do stuff like that without the compiler getting clever about it.

    > you should be able to change the type of anonymous memory (for example, what is allocated with malloc) by simply writing into it

    Well, in my case, I'm the one writing the malloc and the buffer is the anonymous memory. I remember months ago I scoured the GCC documentation for some kind of builtin that would allow me to mark the memory as such but there was nothing. I did add some malloc attributes to my allocation function just like TFA suggested but apparently their main purpose is to optimize based on aliasing nonsense which I disabled anyway.

    • At some point you need to get the memory for your pool from somewhere, for example from malloc [1], hence you can safely set the type of the raw memory by writing into it. You can also change that type to some other type, by writing other stuff (so you can reuse the memory). You can also write metadata to it between uses. What you cannot do is have overlapping lifetimes for different types.

      Making sure that you respect all the underspecified, obscure, and often contradictory rules is not easy, so if you prefer to disable strict aliasing, you have my sympathy. For the most part it is useful for high performance numerical code, and less advantageous for typical pointer chasing stuff.

      From the practical point of view, the safest way to implement a custom allocator is to make sure that the compiler can't see through it, so separate compilation and no LTO and/or launder your pointers through appropriate inline asm.

      [1] but other 'anonymous' sources, like mmap, would also work in practice.

    • > Everything is a valid char array but I can't place structs on top of one? Oh well, nothing I can do about it. I'll just keep strict aliasing disabled.

      Well, yeah. Strict aliasing is less about the incidental values of memory addresses and more about the actual semantics of what you're doing. Writing a struct into the middle of a char array makes no sense, because nothing in the type system guarantees that the array is properly sized or aligned to contain that struct.

    • I feel like there should be a builtin that takes a void* and returns a void* which basically marks the input as aliased by the returned pointer regardless of the TBAA rules.

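The points made in the replies above (memory without a declared type, alignment, and hiding the allocator from the optimizer) can be sketched together as a tiny arena, assuming C11 and a hosted environment with `malloc`. Every name here (`Arena`, `arena_init`, `arena_alloc`, `arena_reset`, `opaque`, `Point`) is hypothetical, and the empty inline asm is only one way to "launder" a pointer on GCC or Clang, not a guarantee from any standard:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical arena type; all names are illustrative. */
typedef struct {
    unsigned char *base; /* start of the malloc'd block */
    size_t cap;          /* total bytes available */
    size_t used;         /* bytes handed out so far */
} Arena;

/* Example payload type for the usage shown below. */
typedef struct { int x, y; } Point;

/* Back the arena with malloc'd storage, which has no declared type. */
static int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Optional: make the pointer's origin opaque to the optimizer.
 * GCC/Clang extension; a no-op on other compilers. */
static void *opaque(void *p) {
#if defined(__GNUC__)
    __asm__ volatile("" : "+r"(p) : : "memory");
#endif
    return p;
}

/* Bump-allocate size bytes aligned to align (a power of two). */
static void *arena_alloc(Arena *a, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t addr = (uintptr_t)a->base + a->used;
    size_t misalign = (size_t)(addr & (align - 1));
    size_t pad = misalign ? align - misalign : 0;
    size_t avail = a->cap - a->used;
    if (pad > avail || size > avail - pad) {
        return NULL; /* out of space */
    }
    a->used += pad + size;
    return opaque(a->base + a->used - size);
}

/* Forget all allocations so the storage can be reused. */
static void arena_reset(Arena *a) {
    a->used = 0;
}
```

A caller would do something like `Point *p = arena_alloc(&a, sizeof *p, _Alignof(Point)); p->x = 3;`. The write is what gives the storage its type. After `arena_reset`, the same bytes can be handed out as a `double` and written again, as long as nothing keeps using the old `Point` pointer, which is the "no overlapping lifetimes" condition mentioned above.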

If you overlay a struct on your char buffer and dereference a pointer to it, you are accessing an object through a type different from its declared type (a char array accessed as struct something *). That is a strict aliasing violation.

To stay within the rules, you could set up a void* to a suitable region in a linker script.