Comment by iforgotpassword

3 years ago

It's still braindead and idiotic. Every relevant platform nowadays has well defined overflow for signed ints. A sane C compiler should go with that and base its optimizations on it. GCC has been a pile of garbage in this regard for many years now. Its devs get further removed from reality with every year. Treating signed int overflow as undefined should be hidden behind a flag.

The C/C++ language doesn't provide a way for the compiler to see that you really meant this one check to take precedence over the implicit promise in another.

The reason why C++ is always relevant here (though C macros and inlining cause similar issues) is that generic programming being close to optimal is a language feature - and one of the ways that's possible is by letting you write reusable code that might be "called" from a context in which some of the checks or conditions just aren't necessary. It's by design that the optimizer gets to... well, optimize that kind of code.

There's a solid case to be made that the details of C's UB weren't well chosen and we should try to update them; but which decades old choices are perfect? Which are easy to change once there's this much legacy software in operation?

Don't forget that some of those UB's were chosen to deal with hardware realities of the day; i.e. that the "same" operation on different hardware would do different things. For example, leaving signed integer overflow undefined might allow a C compiler to use a signed register that's wider than necessary, which may help on hardware that doesn't have every possible register width, or where there are complex register usage limitations. I'm no hardware geek; I'm sure somebody here knows of real examples where UB allows portability, because that's the point: UB allows people to write portable, performant code - just don't do certain things, and you're fine... which leads us to today's situation, in which UB can feel like a minefield.

  • > Don't forget that some of those UB's were chosen to deal with hardware realities of the day; i.e. that the "same" operation on different hardware would do different things.

    That's an argument for implementation defined behaviour. Not for undefined behaviour, at least not UB in the modern sense.

    • Having implementation defined behavior would imply non-portability. C compilers have all kinds of ways of exposing platform-specific features, but sneaking those into what looks like standard behavior has its own issues. And even if you accept that, that doesn't deal with the issue of inlining, generics, and macros - you can get different implementation defined behavior even in a single hardware implementation like that.

      If that is what you want, compilers have various flags that let you in essence do that. But the next problem with that is (1) that existing code may suddenly and unpredictably lose performance, (2) that you now need to provide some other well-defined behavior for those UB cases, and (3) that the selling point of generics/macros/inlining may be reduced.

      How many relevant UB's are there? I don't know. How much perf would common code lose? I don't know. To be sure, I fully acknowledge that removing UB from the spec may be the right thing to do, but it's also easy enough to find possible problems with that strategy; I'm just pointing out the complexities, which is a lot easier than solving them or knowing which are irrelevant.

  • The problem is not UB per se -- the problem is that the compiler uses UB to make assumptions that are incorrect.

    Removing a comparison because of UB is fucking stupid. The compiler on the one hand assumes that the programmer is diligent enough to consider every invocation of UB, but on the other hand too stupid to see the check they wrote will always be true.

    It's not a good idea.

    • Checks that are always true _in some context_ are entirely normal and by design if the code can be used in a different context. If your code is reused in a way that lets the optimizer re-optimize the code per-context, then you'll benefit from the compiler's ability to remove dead code or even merely to choose less expensive special case ops. Macros, templates and inlining are some common ways that happens, but platform-specific builds and perhaps others exist too.

      For example, imagine you have some SIMD-wide value, and you want to do something to each word or byte that the SIMD value contains. In today's C, you can just write a bunch of ifs: is width < 2? then... is width < 4? then... etc. The compiler will completely elide those ifs and leave behind only the reachable code - if it can specialize that re-used code for the given context.

      Furthermore, today those checks might be implicit via the use of UB. That's perhaps not a great solution looking at the entire ecosystem, but it is the situation we're in. Changing that might be quite a lot of work.
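To make the SIMD-width idea above concrete, here is a rough sketch in C. The helper `sum_lanes` and the two wrappers are made-up names for illustration, and it assumes widths are powers of two. Whether a given compiler actually folds the checks depends on inlining and optimization level, so treat the comments as what is typically expected, not as a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums the first `width` bytes of an 8-byte value.
   It is written once, with every width check spelled out. */
static inline uint32_t sum_lanes(const uint8_t *v, size_t width)
{
    uint32_t total = 0;
    if (width >= 1) total += v[0];
    if (width >= 2) total += v[1];
    if (width >= 4) total += v[2] + v[3];
    if (width >= 8) total += v[4] + v[5] + v[6] + v[7];
    return total;
}

/* Call sites with a constant width. Once sum_lanes is inlined, each
   comparison has a known result, so an optimizer can usually drop the
   branches and keep only the reachable additions. */
uint32_t sum2(const uint8_t *v) { return sum_lanes(v, 2); }
uint32_t sum8(const uint8_t *v) { return sum_lanes(v, 8); }
```

The point is that the checks stay in the source as documentation of intent, while each specialized copy pays only for the work it actually needs.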

    • I like that the compiler removes comparisons that always have the same result.

      It means I can write clear code, guard things rather than explain in a comment why the guard isn't needed, and know that the compiler will remove the inefficient code. In general, optimizing compilers mean that taking the clearer option is much less of a performance loss. I like that.

      In many of these UB cases, the annoying thing is that the compiler removes the safety feature you explicitly added, but there are plenty of alternatives.
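As one possible illustration of such an alternative, the sketch below checks for overflow before doing the addition, so the signed overflow never happens and there is nothing for the optimizer to assume away. The function names are hypothetical. The second variant assumes GCC 5 or later, or Clang, where `__builtin_add_overflow` is available.

```c
#include <limits.h>
#include <stdbool.h>

/* Portable variant: test the operands first, then add.
   Returns false (and leaves *out untouched) if a + b would overflow. */
bool add_checked(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

#if defined(__GNUC__) || defined(__clang__)
/* Compiler-specific variant (assumes GCC >= 5 or Clang). The builtin
   returns true on overflow and stores the wrapped result in *out. */
bool add_checked_builtin(int a, int b, int *out)
{
    return !__builtin_add_overflow(a, b, out);
}
#endif
```

Both forms keep the guard in the code without relying on what happens after an overflow has already occurred.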

    • The compiler doesn't make assumptions that are incorrect (it is what it is, basically: if it assumes no integer overflows, there had better not be any).

      Your application, and the application programmer, instead fail to fulfill those compiler assumptions.

Signed int overflow being UB is one of the most basic UBs of the language, and what allows generating tight code in loops.

This is not new: -fwrapv was introduced in 2003, but it can quite severely impact code quality. If you don’t care, just set that. Then complain that C is slow, because C is a shit language.

  • > and what allows generating tight code in loops.

    How so? How does breaking an if statement the programmer added make the code faster? If they intended the check not to happen/be required, they wouldn't have written it. Let signed int overflow and leave any code that depends on its value alone. So yes, maybe make -fwrapv the default.

    > because C is a shit language.

    Well, it's as low level as it can get before reaching assembly, but why not try reducing the number of foot guns? Sometimes you still need C, and that's not going to go away for the foreseeable future.

    • > How so?

      A somewhat common example I've seen is sign extension in loops, where the width of the loop variable is not the same as that of the CPU register [0]. If the compiler can assume that signed integer overflow is UB, then it has a lot more freedom to unroll/vectorize the loop [1] (remove -fwrapv and watch Clang go to town).

      Of course, that specific optimization is rendered somewhat moot if the programmer chooses to use a 64-bit loop variable, but that is a slightly different rabbit hole.

      > If they intended the check not to happen/be required, they wouldn't have written it.

      I feel that's somewhat iffy reasoning - if we trust the programmer so much, why allow the implementation to optimize in the first place? And if not to that extreme, where should the line be drawn?

      [0]: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...

      [1]: https://godbolt.org/z/EMaq1j3Kc
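A small sketch of the kind of loop being described, with made-up function names and assuming a 64-bit target where `int` is 32 bits. This is not the code behind either link, just an illustration of the shape of the problem.

```c
#include <stddef.h>

/* The index offset + i is computed in 32-bit int. If overflow wrapped,
   the address would not be a simple linear function of i, so the compiler
   would have to redo the 32-bit add and sign-extend on each iteration.
   With overflow undefined, it may instead keep one 64-bit pointer and
   just step it, which also makes unrolling/vectorizing easier. */
long sum_from(const int *a, int offset, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += a[offset + i];
    return total;
}

/* Same loop with pointer-sized indices: no narrowing or sign extension is
   involved, so this form does not depend on the overflow rule at all. */
long sum_from_wide(const int *a, ptrdiff_t offset, ptrdiff_t n)
{
    long total = 0;
    for (ptrdiff_t i = 0; i < n; i++)
        total += a[offset + i];
    return total;
}
```

Comparing the generated code for the first function with and without -fwrapv is one way to see whether the difference matters on a particular compiler and target.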

    • > How does breaking an if statement the programmer added make the code faster? If they intended the check not to happen/be required, they wouldn't have written it.

      See, your problem is that you’re

      1. not thinking like a compiler

      2. and reasoning on an isolated example

      The compiler does not “break an if statement”, the compiler uses the UB to limit the range of the input and output, it can then propagate this range analysis to see that the check is dead code, and so removes the dead code.

      It’s common for users to write unnecessary or redundant checks, even more so because of inlining, and especially macros.

      If you’re carefully checking for null in every function prologue, and the compiler inlines everything and knows the pointer is non-null, all checks are dead and can be removed. Which is what the compiler does. This reduces the amount of branches (and thus the space needed by the branch predictor), and reduces the amount of code, meaning the new inlined function could fall below the threshold and itself become a candidate for inlining.
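The null-check case can be sketched like this. The struct and function names are invented for the example, and whether the checks are really removed depends on the compiler deciding to inline.

```c
#include <stddef.h>

struct point { int x, y; };

/* Each accessor defensively checks its argument in the prologue. */
static inline int get_x(const struct point *p)
{
    if (p == NULL) return 0;
    return p->x;
}

static inline int get_y(const struct point *p)
{
    if (p == NULL) return 0;
    return p->y;
}

int sum_xy(const struct point *p)
{
    if (p == NULL) return -1;
    /* Past this point p is known to be non-null. Once get_x and get_y are
       inlined here, their own null checks are dead code and can go. */
    return get_x(p) + get_y(p);
}
```

Nothing here relies on UB; it only shows how a fact established in the caller makes the callee's checks redundant after inlining.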

It's not about what your CPU does.

These days undefined overflow for signed integers is mostly used by compilers to be able to assume that e.g. 'a + 1 > a' is always true, and thus eliminate redundant checks.

(And you wouldn't typically write code like 'a + 1 > a', but you can get it either from code generation via macros etc or as an intermediate result from previous optimization passes.)
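Here is a sketch of how a macro can produce that comparison without anyone typing it. `IN_BOUNDS` and `last_is_valid` are hypothetical names. The comment describes what a compiler is allowed to do under the no-overflow assumption, not what any particular one is certain to do.

```c
/* Hypothetical generic bounds check, reused all over a code base. */
#define IN_BOUNDS(idx, len) ((idx) >= 0 && (idx) < (len))

/* Is index a valid for a buffer holding a + 1 elements? */
int last_is_valid(int a)
{
    /* Expands to ((a) >= 0 && (a) < (a + 1)). Because a + 1 overflowing
       would be UB, the compiler may treat a < a + 1 as always true and
       keep only the a >= 0 test. */
    return IN_BOUNDS(a, a + 1);
}
```

For every input where a + 1 does not overflow, the folded and unfolded versions give the same answer; they can only differ at INT_MAX, which is exactly the case the language leaves undefined.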