Comment by tpush

3 years ago

> If a situation has been statically determined to invoke UB that should be a compile time error.

That's simply not how the compiler works.

There is (presumably, I haven't actually looked) no boolean function in GCC called is_undefined_behavior(). Rather, each optimization pass in the compiler can (and does) assume that UB doesn't happen, and results like the article's are then essentially emergent behavior.

See also: https://blog.regehr.org/archives/213
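To make the "emergent behavior" point concrete, here is a minimal sketch of the kind of overflow check under discussion. The function names are made up for illustration, and whether a given compiler actually folds the first check away depends on the compiler and optimization level. The first function tests for overflow after the addition has already happened, so the test can only succeed if UB occurred; the second tests before adding and never evaluates an overflowing expression.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Post-hoc check (hypothetical name). For b > 0, `a + b < a` can only be
 * true if the signed addition overflowed, which is UB, so an optimizer is
 * allowed to assume it is false and drop the comparison. */
bool overflowed_after_add(int a, int b) {
    return b > 0 && a + b < a;
}

/* Pre-check (hypothetical name). Compares against the limits before
 * adding, so no overflowing expression is ever evaluated. */
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return false;
}

/* Small self-check that only uses inputs whose behavior is defined. */
void demo_overflow_checks(void) {
    assert(add_would_overflow(INT_MAX, 1));
    assert(add_would_overflow(INT_MIN, -1));
    assert(!add_would_overflow(40, 2));
}
```

No pass needs to "detect UB" for the first check to disappear: a pass that simplifies `a + b < a` to `b < 0` for signed operands, combined with the `b > 0` guard, is enough.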

It is undefined behaviour if I write GCC --hlep

Does that mean it's acceptable for GCC to reformat my hard drive?

Just because something is UB doesn't give anyone license to do crazy things.

If I misspell --help I expect the program to do something reasonable. If I invoke UB I still expect the program to do something reasonable.

Removing checks for an overflow because overflows 'can't happen' is just crazy.

UB is supposed to allow C to be implemented on different architectures: if you don't know whether it will overflow to INT_MIN, it makes sense to leave the implementation open. If I, the user, know what happens when an int overflows, then I should be able to make use of that and guard against it myself. A compiler undermining that is a bug and user-hostile.

  • > It is undefined behaviour if I write GCC --hlep

    No, it's not, and I don't know why you'd think so. UB is a concept applying to C programs, not GCC invocations.

    > UB is supposed to allow C to be implemented on different architectures: if you don't know whether it will overflow to INT_MIN, it makes sense to leave the implementation open. If I, the user, know what happens when an int overflows, then I should be able to make use of that and guard against it myself.

    I think you're confusing UB with unspecified and implementation defined behavior. It's fine if you think something shouldn't be UB, but you have to go lobbying the C standard for that. Compiler writers aren't to blame here.

    • This has come up before, because, in some technical sense, the C standard does indeed not define what a "gcc" is, so "gcc --help" is undefined behavior according to the C standard, because the C standard does not define the behavior. By the same token, instrument flight rules are undefined behavior.

      A slightly less textualist approach to language recognizes that when we talk about C and UB, we mean behavior, which is undefined, of operations otherwise defined by the C standard.


    • >No, it's not, and I don't know why you'd think so. UB is a concept applying to C programs, not GCC invocations

      What should happen when I invoke --hlep then? The program could give an error, or warn that it's an unrecognised flag. It could ask if you meant --help, or infer that you meant help and give you that. It could give you a choo-choo train running across the screen. Or it could reformat your hard drive. Just because it isn't specifically listed as UB doesn't mean it's not. If it isn't defined then it's undefined. The question is what the reasonable thing to do is when someone types --hlep. I hope we can agree that reformatting your hard drive isn't the most reasonable thing to do.

      >I think you're confusing UB with unspecified and implementation defined behavior

      Am I? What's the reason for not defining integer overflow? Yes unspecified behaviour could be used to allow portability, but so can undefined.

      >It's fine if you think something shouldn't be UB, but you have to go lobbying the C standard for that. Compiler writers aren't to blame here.

      I'm not saying it shouldn't be UB. I'm saying there's reasonable and unreasonable things to do when you encounter UB. In the article the author took reasonable steps to protect themselves and the compiler undermined that. That isn't reasonable. In exactly the same way that --hlep shouldn't lead to my hard drive getting reformatted.

      C gives you enough rope to hang yourself. It isn't required for GCC to tie the noose and stick your head in it though.


    • > It's fine if you think something shouldn't be UB, but you have to go lobbying the C standard for that. Compiler writers aren't to blame here.

      I'm glad I don't live in your country, where the C standard has been incorporated into law, making it illegal for compiler writers to do things that are helpful to programmers and end users, but aren't required by the standard.

  • > UB is supposed to allow C to be implemented on different architectures

    No, that's wrong. Implementation-Defined Behavior is supposed to allow C to be implemented on different architectures. In those cases, the implementation must define the behavior itself, and stick with it. UB, on the other hand, exists for compiler authors to optimize.

    If you want to be mad at someone, be mad at the C standard for defining so much stuff as UB instead of implementation-defined behavior. Integer overflow should really be implementation-defined instead.
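    For anyone following along, a rough sketch of the three categories side by side. The function names are invented for the example, and the comments describe what the C standard says rather than what any particular compiler does.

    ```c
    #include <assert.h>
    #include <limits.h>

    /* Defined: unsigned arithmetic wraps modulo UINT_MAX + 1. */
    unsigned wrapping_add(unsigned a, unsigned b) {
        return a + b;
    }

    /* Implementation-defined for negative x: the implementation must pick a
     * result and document it (an arithmetic shift is the common choice). */
    int shift_right_one(int x) {
        return x >> 1;
    }

    /* Undefined if the mathematical sum does not fit in an int: the standard
     * places no requirement at all on what happens. */
    int plain_add(int a, int b) {
        return a + b;
    }

    /* Self-check restricted to inputs where the result is fully defined. */
    void demo_defined_cases(void) {
        assert(wrapping_add(UINT_MAX, 1u) == 0u);
        assert(shift_right_one(8) == 4);
        assert(plain_add(2, 3) == 5);
    }
    ```

    If integer overflow were implementation-defined, plain_add would sit in the same bucket as shift_right_one: possibly different across targets, but documented and stable on each one.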

    • >No, that's wrong. Implementation-Defined Behavior is supposed to allow C to be implemented on different architectures.

      Is it? We're talking about integer overflow here.

      I wasn't in the meetings when the C standards were written. I'm not convinced this is purely an optimisation thing though.

      I would guess the story is more like:

      Interested party X: "can integer overflow do X?"

      Party Y: "no, because our processor doesn't work like that"

      Party Z: "and it breaks K&R"

      Party X: "how about implementation defined?"

      Party A: "but our compiler targets 5 different processors"

      Party B: "plus that precludes certain optimisations"

    • Not only to optimize but to write safety tools. If you defined all the behavior, and then someone used some rare behavior like integer overflow by accident, it'd be harder to detect that since you have to assume it was intentional.
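      A sketch of what that looks like in practice, assuming GCC or Clang: `-fsanitize=signed-integer-overflow` and `__builtin_add_overflow` are compiler extensions, not standard C, and the function names here are made up.

      ```c
      #include <assert.h>
      #include <limits.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Plain sum. Because signed overflow is never valid, a tool can flag
       * every occurrence: building with -fsanitize=signed-integer-overflow
       * makes GCC and Clang report it at runtime. */
      int unchecked_sum(const int *values, size_t n) {
          int total = 0;
          for (size_t i = 0; i < n; i++) {
              total += values[i];
          }
          return total;
      }

      /* Explicit detection in code. __builtin_add_overflow stores the wrapped
       * result and returns true if the exact sum did not fit. */
      bool checked_sum(const int *values, size_t n, int *out) {
          int total = 0;
          for (size_t i = 0; i < n; i++) {
              if (__builtin_add_overflow(total, values[i], &total)) {
                  return false;
              }
          }
          *out = total;
          return true;
      }

      /* Self-check with one fitting and one overflowing input. */
      void demo_checked_sum(void) {
          const int small[] = {1, 2, 3};
          const int big[] = {INT_MAX, 1};
          int out = 0;
          assert(checked_sum(small, 3, &out) && out == 6);
          assert(!checked_sum(big, 2, &out));
      }
      ```

      If overflow were defined to wrap, the sanitizer couldn't tell an accidental wrap from a deliberate one without extra annotations.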

    • > UB, on the other hand, exists for compiler authors to optimize.

      Was this really the original reason why there's UB in the C standard, or has this been retconned by 'malicious compiler authors'? ;)


    • >UB, on the other hand, exists for compiler authors to optimize.

      s/exists for/has been exploited by/g

      The worst part is the optimizations aren't even that significant. (I recall a blog post of somebody testing this but I can't find it rn)

  • > It is undefined behaviour if I write GCC --hlep

    Well no, it's a compilation error, you need at the very least a semicolon after hlep and from there on it depends on what GCC is. If it's a function you need parentheses around --hlep, if it's a type you need to remove the --, if it's a variable you need to put a semicolon after it,...

    Because GCC is all-caps I'm guessing it's a macro, so here's an example of how you could write it (though it won't be UB): https://godbolt.org/z/dYMddrTjj

    • I'm not sure if you're supporting my pov by showing the absurdity of the other position???

      Yeah sure, if my phone auto incorrects gcc to GCC then that is technically meaningless so you're completely free to interpret my comment how you want.

      ..... Although..... GCC stands for GNU Compiler Collection so it can be reasonably capitalised, so maybe then, rather than saying anything goes we should do something reasonable because then you aren't left saying something really stupid if you're wrong???
