Comment by beeforpork

9 hours ago

The UB in unaligned pointers is even worse: an unaligned pointer in itself is UB, not only an access through it. So even implicitly converting a `void *v` to an `int *i` (like `i = v` in C, or `f(v)` when `f()` accepts an `int *`) is UB if the resulting pointer is not correctly aligned for `int`.

It is important to understand that this is a C-level problem: if you have UB in your C program, then your C program is broken, i.e., it is formally invalid and wrong, because it violates the C language spec. UB is not on the HW; it has nothing to do with crashes or faults. That cast from void* to int* most likely corresponds to no code on the HW at all -- types exist in C only, not on the HW, so a cast is a reinterpretation at C level -- and no HW will crash on that cast (because there is not even code for it). You may think that an integer value sitting in a register must be fine, right? No: it's not about pointers actually being integers in registers on your HW. Your C program is broken by definition if the converted pointer is unaligned.
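To make that concrete, here is a minimal sketch of the two usual ways around it: check alignment before converting, or never form the `int *` at all and copy the bytes out with `memcpy`. The helper names (`is_aligned_for_int`, `as_int_ptr`, `read_int`) are made up for illustration, and the alignment check assumes that converting a pointer to `uintptr_t` yields a flat address, which is implementation-defined but holds on mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p is suitably aligned for int.
   Assumes the pointer-to-uintptr_t mapping is a flat address. */
static int is_aligned_for_int(const void *p) {
    return (uintptr_t)p % _Alignof(int) == 0;
}

/* Convert only after checking: the implicit void* -> int* conversion
   below is the step that is UB when p is misaligned. */
static int *as_int_ptr(void *p) {
    assert(is_aligned_for_int(p));
    return p;
}

/* Read an int from any address without ever forming a misaligned int*.
   memcpy works on bytes, so alignment of p does not matter. */
static int read_int(const void *p) {
    int value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

On common compilers the `memcpy` in `read_int` is typically lowered to a plain load where the target allows it, so this usually costs nothing extra, though that is an optimization and not a guarantee.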

Author here.

> an unaligned pointer in itself is UB

Yup. Per the "Actually, it was UB even before that" section in the post.

> UB is not on the HW, it has nothing to do with crashes or faults

Yeah. I tried to convey this too, but I'm also addressing the people who say "but it's demonstrably fine", by giving examples. Because it's not.

Which is totally fine and expected for any decent programmer. Casting pointers is clearly "here be dragons" territory.

  • Many, many programmers come to C (and C++) with a lower-level understanding that actually gets in the way here. They understand that all types "are" just bytes and that all pointers "are" just register-sized integer addresses, because that's how the hardware works and has worked for decades.

    It's perfectly reasonable to expect any load through `int*` to just load 4 bytes from memory, done and done. They get surprised that it is far from the whole story, and the result is UB.

    Meanwhile, the actual computers we have been using for decades have no problems actually just loading 4 bytes through any arbitrary pointer with zero overhead. But no.

    • > They understand that all types "are" just bytes and that all pointers "are" just register-sized integer addresses, because that's how the hardware works and has worked for decades.

      I'd clarify this with "They understand that all values are just bytes".

      > Meanwhile, the actual computers we have been using for decades have no problems actually just loading 4 bytes through any arbitrary pointer with zero overhead.

      It's partly the standard's fault here - rather than saying "We don't know how vendors will implement this, so we shall leave it as implementation-defined", they say "We don't know how vendors will implement this, so we will leave it as undefined".

      A clear majority of the UB problems with C could be fixed if the standards committee slowly moved all UB into IB. It's not that there isn't any progress (signed two's complement is coming, after all), it's that there is (I believe) much pushback from compiler authors (who dominate the standards committee) who don't want to turn UB into IB.

    • > Meanwhile, the actual computers we have been using for decades have no problems actually just loading 4 bytes through any arbitrary pointer with zero overhead. But no.

      Not if those 4 bytes span a cache-line boundary; that will most likely halve throughput compared to loading values inside a single cache line. And if it causes cache misses, it takes up twice the L2 or L3 bandwidth.

      Even worse, if the int spans two pages, it will need two TLB lookups. If it's a hot variable and the only thing you use from those pages, it even uses up an additional TLB entry, that could otherwise be used for better perf elsewhere, etc.

      And if you're on embedded (and many C programs are), Cortex-M CPUs either can't handle unaligned accesses (M0, M0+) or take 2-3 times as long (splitting the load into 2x2 bytes or 1x2 + 2x1 bytes).

    • > that all pointers "are" just register-sized integer addresses

      And crucially until DR#260 https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm this was a reasonable guess as to what the pointers are. Probably not a wise guess because it's not how your C compiler worked even then, but a reasonable guess if you didn't think too hard about this.

      One way I like to think about this is that all C's types are just the machine integers wearing crap Halloween costumes. Groucho glasses for bool, maybe a Lincoln hat for char, float and double can be bright orange make-up and a long tie. But the pointers are different, because unlike the other types those have provenance.

      5 == 5, 'Z' == 'Z', true == true, 1.5f == 1.5f, but whether two pointers are equivalent does not depend solely on their bit pattern in C.

    • I'm not sure that's right. For instance, the Pentium 4 spec explicitly says unaligned int32 loads take longer. And x86/x64 is very gentle in that regard, other archs would whip you. So an unaligned int access is rightfully treated differently. It should be IB.

      Just creating the pointer, though, should not be UB, even though it apparently is. It should not even be IB.

    • > Meanwhile, the actual computers we have been using for decades have no problems actually just loading 4 bytes through any arbitrary pointer with zero overhead.

      PCs yes, but there are many other things C is compiled to for which this is not true.

  • Yes, but casting pointers is virtually required in any non-trivial C program, and frankly even in a lot of the trivial ones, because there's no other way to do type erasure or generics. Well, there kind of are now, and there have always been macros, but void * has historically been the predominant way this is done at runtime.

> an unaligned pointer in itself is UB, not only an access to it.

Can someone point to where the standard states this?

Does that mean that if I have a struct with `#pragma pack(push, 1)` I can't use pointers to any members that don't happen to be aligned?
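For the packed-struct case, a hedged sketch of what seems to be the safe pattern: access the member through the struct (the compiler knows the layout is packed), or copy it out by bytes, instead of taking a `uint32_t *` to it. The `struct packet` layout and function names are invented for illustration, and `#pragma pack` is itself a compiler extension (supported by GCC, Clang and MSVC), not standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
struct packet {              /* hypothetical layout, for illustration only */
    uint8_t  tag;
    uint32_t length;         /* lands at offset 1, so usually misaligned */
};
#pragma pack(pop)

/* Fails to compile if the packing pragma was not honored. */
static_assert(offsetof(struct packet, length) == 1, "struct is not packed");

/* Member access through the struct: the compiler emits whatever load
   sequence the packed layout requires. */
static uint32_t length_direct(const struct packet *p) {
    return p->length;
}

/* Byte-wise copy of the member: no uint32_t* to the misaligned field
   is ever created. */
static uint32_t length_copied(const struct packet *p) {
    uint32_t out;
    memcpy(&out, (const unsigned char *)p + offsetof(struct packet, length),
           sizeof out);
    return out;
}
```

What this deliberately avoids is `uint32_t *q = &p->length;`. GCC and Clang can flag that pattern with `-Waddress-of-packed-member`.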

The problem with C UB is that originally it meant the compiler has the freedom to map your code to the hardware in spite of machine instructions differing slightly from one architecture to another. The same C program may exhibit different behaviour depending on which architecture it is running on.

This type of UB is fine and nobody really complains about hardware differences leading to bugs.

However, over time aggressive readings of UB evolved C into an implicit "Design by Contract" language where the constraints have become invisible. This creates a similar problem to RAII, where the implicit destructor calls are invisible.

When you dereference a pointer in C, the compiler adds an implicit non-nullable constraint to the function signature. When you pass in a possibly nullable pointer into the function, rather than seeing an error that there is no check or assertion, the compiler silently propagates the non-nullable constraint onto the pointer. When the compiler has proven the constraints to be invalid, it marks the function as unreachable. Calls to unreachable functions make the calling function unreachable as well.
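A small sketch of that implicit constraint, with made-up function names. In the first function the dereference comes before the check, so a compiler is allowed to assume `p` is non-null and drop the check entirely. The other two make the constraint visible, either as a real check or as an assertion. Whether a given compiler actually removes the check depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Dereference first, check later: the load already implies p != NULL,
   so the compiler may treat the check as dead code and delete it. */
static int first_then_check(const int *p) {
    int value = *p;          /* UB if p is NULL */
    if (p == NULL)           /* may be optimized away */
        return -1;
    return value;
}

/* Check first: the null case has defined behavior and the check stays. */
static int check_then_first(const int *p) {
    if (p == NULL)
        return -1;
    return *p;
}

/* Contract made explicit: fails loudly in debug builds instead of
   silently propagating the non-null assumption to callers. */
static int asserted(const int *p) {
    assert(p != NULL);
    return *p;
}
```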

  • > The problem with C UB is that originally it meant the compiler has the freedom to map your code to the hardware in spite of machine instructions differing slightly from one architecture to another. The same C program may exhibit different behaviour depending on which architecture it is running on.

    You're conflating undefined behavior with implementation-defined behavior. If it was only to do with what we think of as normal variance between processors, then it would be easy to make it implementation-defined behavior instead.

    The differentiating factor of undefined behavior is that there are no constraints on program behavior at that point, and it was introduced to handle cases where processor or compiler behavior cannot be meaningfully constrained. One key class is of course hardware traps: in the presence of compiler optimizations, it is effectively impossible to make any guarantees about program state at the time of a trap (Java tried, and most people agreed they failed); but even without optimizations, there are processors that cannot deliver a trap at a precise point of execution and thus will continue to execute instructions after a trapping instruction.

But that seems obvious. You can't load an integer from an unaligned address.

It's not only C-level, is it? There's no (guarantee across architectures for) machine code for that either.
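For what it's worth, a portable way to read an integer from an arbitrary address without relying on any unaligned load instruction is to assemble it from single bytes. A minimal sketch, assuming little-endian data in the buffer (`load_le32` is a hypothetical name):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian value from any
   address. Only byte loads are used, so alignment never matters, and
   the result does not depend on the host's endianness. */
static uint32_t load_le32(const unsigned char *p) {
    assert(p != NULL);
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

For example, with the bytes `0x78 0x56 0x34 0x12` starting at an odd offset in a buffer, `load_le32` returns `0x12345678`.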

  • > You can't load an integer from an unaligned address.

    You can, and the results are machine specific, clearly defined and well-documented. Ancient ARM raises an exception, modern ARM and x86 can do it with a performance penalty. It's only the C or C++ layer that is allowed to translate the code into arbitrary garbage, not the CPU.

  • You missed the point: the pointer existing as a value of that type at all is UB, even if you never try to access anything through it and no corresponding machine code is ever emitted.

    • Yes? I agree with that. I don't really see the issue there. The computer will allocate data at aligned addresses, so you would have to be doing something weird to begin with to access unaligned pointers. And aligned access is always better anyway. I guess packed structs are a thing if you're really byte golfing. Maybe compressed network data would also make sense.

      But then I would assume you are aware of unaligned pointers, and have a sane way to parse that data, rather than read individual parts of it from a raw pointer.

      I am curious, what would be a legitimate reason for an unaligned pointer to int?
