Comment by tovej

9 hours ago

But that seems obvious. You can't load an integer from an unaligned address.

It's not only a C-level thing, is it? There's no machine code for that either (or at least no guarantee of it across architectures).


tovej

> You can't load an integer from an unaligned address.

You can, and the results are machine specific, clearly defined and well-documented. Ancient ARM raises an exception, modern ARM and x86 can do it with a performance penalty. It's only the C or C++ layer that is allowed to translate the code into arbitrary garbage, not the CPU.

saagarjha 7 hours ago
There’s usually no performance penalty on modern hardware.
- orlp 3 hours ago
  
  On modern hardware there's typically only a performance penalty if the unaligned load spans a cache line.
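To make the cache-line point concrete, here is a minimal sketch in C of the check involved. It assumes 64-byte cache lines, which is common on current x86-64 and many ARM cores but not universal; `CACHE_LINE_SIZE` and `load_spans_cache_line` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Check your target before relying on this. */
#define CACHE_LINE_SIZE 64u

/* Returns true if a load of `size` bytes starting at address `addr`
 * touches two cache lines. Assumes size <= CACHE_LINE_SIZE. */
static bool load_spans_cache_line(uintptr_t addr, size_t size)
{
    uintptr_t offset = addr % CACHE_LINE_SIZE;  /* position inside the line */
    return offset + size > CACHE_LINE_SIZE;     /* runs past the end of the line? */
}
```

Under that assumption, an 8-byte load at offset 56 stays within one line, while the same load at offset 57 straddles two.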

matheusmoreira 9 hours ago

Sure you can. In many architectures it works just fine. Works perfectly in x86_64, for example. It's just a little slower.

tovej 8 hours ago
"In many architectures" does not mean you can. The standard is supposed to cover all architectures.
- matheusmoreira 7 hours ago
  
  If some architecture traps on unaligned access, then the compiler can and should simply generate the correct code so that it loads the integer piece by piece instead. Load multiple integers and shift and mask away the irrelevant bits, done. This is exactly what modern architectures already do in hardware. Works, it's just a little slower.
  This is exactly what the compilers do if you use a packed structure to access unaligned data. Works everywhere, as expected. Compilers have always known what to do, they just weren't doing it. C standard says no.
  The fact is the standard is garbage and the first thing every C programmer should learn is that they can and should ignore it. There is never any reason to wonder what the standard is supposed to do. The only thing that matters is what compilers actually do.
  
- crote 7 hours ago
  
  That's why we write C instead of assembly, isn't it?
  You could also mandate that a compiler for architectures without unaligned access either has to prove that the access is going to be aligned or insert a wrapper to turn the unaligned access into two aligned ones.
  Just pretending the issue doesn't exist at all and making it the programmer's problem by leaving it as UB in the spec is a choice.
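The "load it piece by piece" idea from the two replies above can be written in standard C today without forming an unaligned `int` pointer. A minimal sketch, with hypothetical function names: one version copies the bytes with `memcpy`, the other assembles the value from individual bytes in little-endian order. Compilers typically lower the `memcpy` form to a single load on targets that tolerate unaligned access, but that is an optimization, not a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t in native byte order from any address, aligned or not.
 * memcpy has no alignment requirement on its arguments. */
static uint32_t load_u32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Reads a little-endian uint32_t one byte at a time, then shifts the
 * pieces into place. Works on any host, regardless of its endianness. */
static uint32_t load_u32_le(const void *p)
{
    const unsigned char *b = p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

Both take a `const void *`, so the caller never has to cast a misaligned address to `uint32_t *` in the first place.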

mbel 9 hours ago

Unless your code targets some exotic architecture, like idk x86.

cataphract 8 hours ago
Not really. Wait until the compiler starts vectorizing your code and using instructions requiring alignment (like the ones with A or NT in the mnemonic).
- saagarjha 6 hours ago
  
  The compiler will usually not generate those.
  

pjc50 9 hours ago

You missed the point: the pointer existing as a value of that type at all is UB, even if you never try to access anything through it and no corresponding machine code is ever emitted.
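For reference, the rule in question is C11 6.3.2.3 paragraph 7: converting a pointer to another object pointer type is undefined if the result is not correctly aligned for the referenced type. A minimal sketch of checking before converting follows. The helper names are hypothetical, and the check assumes that the integer value of a pointer reflects its address, which holds on mainstream platforms but is implementation-defined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if `p` is suitably aligned to be used as an int pointer.
 * Assumption: (uintptr_t)p is the plain address of p. */
static bool is_aligned_for_int(const void *p)
{
    return (uintptr_t)p % _Alignof(int) == 0;
}

/* Performs the conversion only when it is well defined; returns NULL
 * otherwise, so a misaligned int pointer value is never created. */
static const int *as_int_ptr_or_null(const void *p)
{
    return is_aligned_for_int(p) ? (const int *)p : NULL;
}
```

The point of the wrapper is that the check happens on the `void *`, before any `int *` exists.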

tovej 8 hours ago
Yes? I agree with that. I don't really see the issue there. The computer will allocate data at aligned addresses, so you would have to be doing something weird to begin with to access unaligned pointers. And aligned access is always better anyway. I guess packed structs are a thing if you're really byte golfing. Maybe compressed network data would also make sense.
But then I would assume you are aware of unaligned pointers, and have a sane way to parse that data, rather than read individual parts of it from a raw pointer.
I am curious, what would be a legitimate reason for an unaligned pointer to int?
- simonask 4 hours ago
  
  String search algorithms would be one example, where a 64-bit register can be used as a “vector” containing 8x1 bytes.
  
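A minimal sketch of the string-search case, assuming nothing beyond standard C: a 64-bit word is treated as eight byte lanes, and a well-known bit trick tests all eight lanes for a match at once. The function names are hypothetical. The word is loaded with `memcpy`, so the buffer may start at any address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if and only if at least one of the eight bytes in v is zero. */
static uint64_t has_zero_byte(uint64_t v)
{
    return (v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL;
}

/* Returns the index of the first byte equal to c in buf[0..len),
 * or len if there is none. */
static size_t find_byte(const unsigned char *buf, size_t len, unsigned char c)
{
    /* Broadcast c into every byte lane of a 64-bit word. */
    const uint64_t pattern = 0x0101010101010101ULL * c;
    size_t i = 0;

    /* Scan eight bytes per step; XOR turns matching bytes into zero. */
    for (; i + 8 <= len; i += 8) {
        uint64_t word;
        memcpy(&word, buf + i, sizeof word);  /* safe for unaligned buf */
        if (has_zero_byte(word ^ pattern))
            break;                            /* match is within these 8 bytes */
    }

    /* Pin down the exact position, and handle the tail shorter than 8 bytes. */
    for (; i < len; i++)
        if (buf[i] == c)
            return i;
    return len;
}
```

Locating the match with a byte loop after the word test keeps the sketch independent of the host's endianness.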