Comment by roblabla

3 years ago

> UB is supposed to allow C to be implemented on different architectures

No, that's wrong. Implementation-Defined Behavior is supposed to allow C to be implemented on different architectures. In those cases, the implementation must define the behavior itself, and stick with it. UB, on the other hand, exists for compiler authors to optimize.

If you want to be mad at someone, be mad at the C standard for defining so much stuff as UB instead of implementation-defined behavior. Integer overflow should really be implementation-defined instead.
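
To make the three categories concrete, here is a minimal sketch in C. The function names are made up for illustration, and the classification in the comments reflects my reading of the C standard rather than anything stated in this thread.

```c
#include <limits.h>

/* Fully defined: unsigned arithmetic wraps modulo 2^N,
   so UINT_MAX + 1u is always 0. */
unsigned wrap_unsigned(unsigned x) {
    return x + 1u;
}

/* Implementation-defined when x is negative: the compiler must pick a
   result (usually an arithmetic shift) and document it. */
int shift_right_once(int x) {
    return x >> 1;
}

/* Undefined when x == INT_MAX: the standard places no requirement on
   what happens, so the optimizer may assume it never occurs. */
int inc_signed(int x) {
    return x + 1;
}
```

Only the last function is something an optimizer may reason about as "cannot happen"; the middle one has to behave the same way every time on a given implementation.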

>No, that's wrong. Implementation-Defined Behavior is supposed to allow C to be implemented on different architectures.

Is it? We're talking about integer overflow here.

I wasn't in the meetings when the C standards were written. I'm not convinced this is purely an optimisation thing though.

I would guess the story is more like this:

Interested party X: "can integer overflow do X?"

Party Y: "no, because our processor doesn't work like that."

Party Z: "and it breaks K and R"

Party X: "how about implementation defined?"

Party A: "but our compiler targets 5 different processors"

Party B: "plus that precludes certain optimisations"

Not only to optimize but to write safety tools. If you defined all the behavior, and then someone used some rare behavior like integer overflow by accident, it'd be harder to detect that since you have to assume it was intentional.
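
As a rough illustration of the kind of check such a tool performs, here is a hand-written sketch in C. The name `checked_add` is hypothetical; real tools such as the `-fsanitize=signed-integer-overflow` option in GCC and Clang insert comparable checks automatically instead of requiring code like this.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true, or returns
   false without touching *out if the sum would overflow int.
   The comparisons are arranged so that they never overflow themselves. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false; /* overflow: reported as a bug, not assumed intentional */
    }
    *out = a + b;
    return true;
}
```

Because overflow is never valid in a conforming program, a checker can flag every occurrence. If wrapping were defined behavior, the same report would be a false positive whenever the wrap was deliberate.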

> UB, on the other hand, exists for compiler authors to optimize.

Was this really the original reason why there's UB in the C standard, or has this been retconned by 'malicious compiler authors'? ;)

  • It is the original reason. For example, register allocation is possible because stack smashing is UB.

    • UB is also very much based around software incompatibilities though, not just the ability to optimise stuff.

      But where IB has useful definitions that an implementation can document, UB was designated as such because the behaviours were considered so divergent that allowing them was useless, and so it was much easier to just forbid them all.

But then again, UB doesn't mean the compiler author can't treat it as implementation-defined and do something reasonable.

  • You've got it backward. The only reason UB doesn't immediately stop compilation is backward compatibility with existing implementation-defined behavior: you don't want to break compilation of existing programs each time the compiler converges on the C spec and identifies an instance of undefined behavior.

    And since you want some cross-compiler compatibility, you also import third parties' implementation-defined UB.

    This is not some reasonable decision made on principle; the proper way would be to abort compilation on every instance of UB. The reality is that the proper way would be too harsh on existing codebases, pushing people to use a less strict compiler or to stop updating versions, both of which are undesirable outcomes for compiler writers.

    • I can't really follow. What would be wrong with making -fwrapv the default? I.e. let the compiler assume signed integers are two's complement on platforms where that holds (virtually everything in use today), then stop assuming "a + 1 < a" cannot be true for signed ints. How would that make existing code worse, or break it? It's basically what you already get with -O0 afaict, so any program that breaks under it would already be broken with optimizations turned off.

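
For readers who have not seen the "a + 1 < a" idiom, here is a small sketch in C of the check under discussion next to a rewrite that never overflows. Both function names are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* The idiom from the comment above. Signed overflow is UB, so without
   -fwrapv a compiler is allowed to assume a + 1 never wraps and fold this
   to "return false". With -fwrapv it is true exactly when a == INT_MAX. */
bool next_overflows_naive(int a) {
    return a + 1 < a;
}

/* Same intent with no overflow involved, so the result does not depend
   on compiler flags or optimization level. */
bool next_overflows_checked(int a) {
    return a == INT_MAX;
}
```

The two functions agree for every input that does not overflow; they can only differ for INT_MAX, which is exactly the case the thread is arguing about.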

>UB, on the other hand, exists for compiler authors to optimize.

s/exists for/has been exploited by/g

The worst part is the optimizations aren't even that significant. (I recall a blog post of somebody testing this but I can't find it rn)