Comment by jstimpfle
7 hours ago
You can run your code under ASAN and UBSAN nowadays, it will catch many or most of issues as they happen.
But that's completely besides the point. UB on signed overflow, or really most of UB, is not unrelated to C flexibility. It is a detail of the spec related to portability and performance. IIRC it is even required to make such trivial optimizations as turning
for (int i = 0; i < n; i++) func(a[i]);
into
for (Foo *p = a, *last = a + n; p < last; p++) func(p);
saving arithmetics and saving a register, on architectures where `int` is smaller than pointers. But there is also options like -fwrapv on GCC for example, allowing you to actually use signed overflow.
How is undefined behavior necessary for this transformation?
IIRC computation of the address is done by computing offset from base pointer as a multiplication in (32-bit) int, (like p + (i * sizeof (Foo)). The right term might overflow, but due to signed overflow being UB, the compiler is able to assume that it does not, so the transformation to do the arithmetic entirely in (64-bit) pointer space is valid.
Exactly. You as the programmer know that the loop counter won't overflow, and in general, essentially nobody would actually write it that way. But if you don't assume it can't happen, the possibility for signed overflow is everywhere in address computations.
This is also a major blocker for auto-vectorization. Can't coalesce a load of a[i], a[i+1], a[i+2], a[i+3] into a load of a[i:i+3] if there's a possibility that `i+1`, `i+2` or `i+3` wrapped around (thus causing your "contiguous" load to be non-contiguous). This is a big reason why you shouldn't use `unsigned` for loop counters, especially if they're going to be used as an index into an address calculation.
*is not related to C flexibility