Comment by Someone

3 years ago

> If a situation has been statically determined to invoke UB that should be a compile time error.

But you typically can’t prove that. There’s lots of code where you could prove it might happen at runtime for some inputs, but proving that such inputs occur would, at least, require whole-program analysis. The moment a program reads outside data at runtime, chances are it becomes impossible.

If you want to ban all code that might invoke it, it boils down to requiring programmers to think about adding checks around every addition, multiplication, subtraction, etc. in their code, and to actually add them around most of those operations. Programmers would then want the compiler to insert such checks for them, and C would no longer be C.
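As a rough illustration of what "a check around every addition" means in practice, here is a minimal sketch of one guarded addition. The helper name `checked_add` is made up for this example and is not from any library.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true, or
 * returns false (leaving *out untouched) if the sum would overflow. */
bool checked_add(int a, int b, int *out)
{
    /* Test against the limits before adding, so the overflowing
     * addition itself is never executed. */
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;   /* known to be in range at this point */
    return true;
}
```

Every `a + b` on signed integers would need this treatment, and multiplication and subtraction need their own variants, which is the burden described above.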

If, as you seem to say, you want to ban a subset that’s easily provable, I think enabling all warnings already does that. See for example https://clang.llvm.org/docs/DiagnosticsReference.html#wargum..., https://clang.llvm.org/docs/DiagnosticsReference.html#warray..., https://clang.llvm.org/docs/DiagnosticsReference.html#winteg... , https://clang.llvm.org/docs/DiagnosticsReference.html#wcompa...

C will accept every valid program, at the cost of also accepting some invalid programs. Rust will reject every invalid program, at the cost of also rejecting some valid ones.

("unsafe" (aka "trust me" mode) means that's not quite true, and so do some of the warnings and errors that you can enable on a C compiler, but it's close enough)

> But you typically can’t prove that. There’s lots of code where you could prove it might happen at runtime for some inputs, but proving that such inputs occur would, at least, require whole-program analysis. The moment a program reads outside data at runtime, chances are it becomes impossible.

No, I specifically ruled out doing that in my comment.

I was referring to the situation where a null check was deleted because the compiler found UB through static analysis.

(Or specifically, placing a null check after a possibly-null usage. It is wrong to assume that after possibly-null usage the possibly-null variable is definitely-null.)
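A minimal sketch of the pattern under discussion, assuming a made-up `struct packet` type and function names (none of this is from the original code):

```c
#include <stddef.h>

struct packet { int length; };   /* hypothetical type for illustration */

/* The check comes after the dereference. */
int length_checked_late(struct packet *p)
{
    int len = p->length;   /* undefined behaviour if p is NULL */
    if (p == NULL)         /* an optimiser may treat this branch as dead */
        return -1;
    return len;
}

/* The check comes first, so there is nothing for the optimiser to drop. */
int length_checked_first(struct packet *p)
{
    if (p == NULL)
        return -1;
    return p->length;
}
```

Both functions agree for non-null arguments; they can only differ when `p` is null, which is exactly the case where the first one has already invoked undefined behaviour.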

  • As I recall, the compiler didn't know it had found undefined behaviour. An optimisation pass saw "this pointer is dereferenced", and from that inferred that if execution continued, the pointer can't be null.

    If the pointer can't be null, then code that only executes when it is null is dead code that can be pruned.

    Voila, null check removed. And most relevantly, it didn't at any point know "this is undefined behaviour". At worst it assumed that dereferencing a null would mean it wouldn't keep executing.

    • It removed a redundant check, then. Why not warn about that? gcc -Wpedantic even warns about empty statements, FCOL.

  • The compiler didn't find UB. What it saw was a pointer dereference, followed by some code later on that checked if the pointer was null.

    Various optimisation phases in compilers try to establish the possible values (or ranges) of variables, and later phases can then use this to improve calculations and comparisons. It's very generic, and useful in many circumstances. For example, if the compiler can see that an integer variable 'i' can only take the values 0-5, it could optimise away a later check of 'i<10'.

    In this specific case, the compiler reasoned that the pointer variable could not be zero, and so checks for it being zero were pointless.
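The range reasoning described here does not need undefined behaviour at all. A small sketch, assuming a made-up function `classify` in which the variable is confined to 0-5 by a modulo:

```c
/* Hypothetical example: i is reduced to the range 0..5 before the check. */
int classify(unsigned value)
{
    unsigned i = value % 6;   /* i can only be 0, 1, 2, 3, 4 or 5 */
    if (i < 10)               /* always true given the range above */
        return (int)i;
    return -1;                /* unreachable, so a compiler may drop it */
}
```

Whether a particular compiler actually removes the comparison depends on the compiler and optimisation level; the point is only that it is allowed to, because no input can make the check fail.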

    • Yes - and the original post here is the same:

          if (x < 0)
              return 0;
      

      The compiler now knows x's possible range is non-negative.

          int32_t i = x * 0x1ff / 0xffff;
      

      A non-negative multiplied and divided by positive numbers means that i's possible range is also non-negative (this is where the undefinedness of integer overflow comes in - x * 0x1ff can't have a negative result without overflow occurring).

          if (i >= 0 && i < sizeof(tab)) {
      

      The first conditional is trivially true now, because of our established bounds on i, so it can just be replaced with "true". This is what causes the code to behave contrary to the OP's expectations: in his execution environment, the overflow case can leave a negative value in i.
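For reference, a sketch of how the fragments above fit together, next to a variant that avoids the overflow by widening to 64 bits. The table contents and size, the function names, and the `-1` return for out-of-range indices are all assumptions made for this example, not taken from the original post.

```c
#include <stdint.h>

/* Hypothetical table; its size and contents are assumptions. */
static const unsigned char tab[256] = { 1, 2, 3 };

/* Assembled from the fragments above: x * 0x1ff is a 32-bit signed
 * multiplication on typical platforms, so it can overflow for large x. */
int lookup_original(int32_t x)
{
    if (x < 0)
        return 0;
    int32_t i = x * 0x1ff / 0xffff;
    if (i >= 0 && i < sizeof(tab))
        return tab[i];
    return -1;
}

/* Widened variant: the product always fits in 64 bits, so the range
 * check sees the mathematically correct value. */
int lookup_widened(int32_t x)
{
    if (x < 0)
        return 0;
    int64_t i = (int64_t)x * 0x1ff / 0xffff;
    if (i >= 0 && i < (int64_t)sizeof(tab))
        return tab[i];
    return -1;
}
```

The two agree for every x whose product fits in 32 bits. For larger x only the widened version has defined behaviour, and there it reports the index as out of range.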

    • It is probably more precise to say “if the pointer is null, then it doesn’t matter what I do here, so I am permitted to eliminate this” than to say that it can’t be null here. (It can’t be both null and defined behavior.)

  • Consider function inlining, or use of a macro for some generic code. For safety, we include a null check in the inlined code. But then we call it from a site where the variable is known to not be null.

    The compiler hasn't found UB through static analysis, it has found a redundant null check.
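A minimal sketch of that inlining case, with made-up names (`read_or_default`, `caller`):

```c
#include <stddef.h>

/* Hypothetical generic helper with a defensive null check. */
static inline int read_or_default(const int *p, int fallback)
{
    if (p == NULL)
        return fallback;
    return *p;
}

int caller(void)
{
    int local = 42;
    /* &local can never be NULL, so once the helper is inlined here
     * the null check is redundant and may be removed. */
    return read_or_default(&local, -1);
}
```

No undefined behaviour is involved anywhere in this code. The check is still useful for other callers, which is why warning on every removed check would be noisy.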

  • > I was referring to the situation where a null check was deleted because the compiler found UB through static analysis.

    You can say that, but in practice -O0 is fairly close to what you're asking for already. Most people are 100% unwilling to live with that performance tradeoff. We know that because almost no one builds production software without optimizations enabled.

    The compiler is not intelligent. It just tries to make deductions that let it optimize programs to run faster. 99.999% of the time when it removes a "useless" null check (aka a branch that has to be predicted, eats up branch prediction buffer space, and bloats the number of instructions) it really is useless. The compiler can't tell the difference between the useless ones and the security-critical ones because all of them look the same and are illegal by the rules of the language.

    Even if you mandate that null checks can't be removed, that doesn't fix all the other situations where inserting the relevant safety checks has huge perf costs or where making something safe reduces to the halting problem.

    FWIW I agree that the committee should undertake an effort to convert UB to implementation-defined where possible... for example just mandate two's complement integer representations and make signed integer overflow ID.

    To illustrate the complexity: most loops end up using an int, which is 32-bit on most 64-bit platforms. If you require signed integer wrapping, that slows down all loops, because the compiler must insert artificial checks to make the 64-bit register perform 32-bit wrapping, and we can't change the size of int at this point.
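A sketch of the kind of loop meant here, assuming a made-up function `sum_every_other`. The comments restate the argument above rather than describe what any specific compiler emits.

```c
/* Hypothetical example: a 32-bit int index used to address memory. */
long sum_every_other(const int *a, int n)
{
    long total = 0;
    /* With overflow undefined, the compiler may assume i never wraps
     * and keep it in a full-width register for address arithmetic.
     * If wrapping were mandated, i += 2 would have to preserve
     * 32-bit wraparound semantics. */
    for (int i = 0; i < n; i += 2)
        total += a[i];
    return total;
}
```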

    • > FWIW I agree that the committee should undertake an effort to convert UB to implementation-defined where possible... for example just mandate two's complement integer representations and make signed integer overflow ID.

      To accommodate trapping implementations you'd have to make it "implementation-defined or an implementation-defined signal is raised", which, as it happens, is exactly the wording for when an out-of-range value is assigned to a signed type. In practice it means you have to avoid it in your code anyway, because "an implementation-defined signal is raised" means "your program may abort and you can't stop it".

  • But again, the compiler did not find UB through static analysis. The compiler inferred that the pointer could not be null and removed a redundant check.

    For example, would you not expect a compiler to remove a redundant bounds check if it can infer that an index can't be out of range?

    • The compiler made a dangerous assumption that the standard permits ("the author surely has guaranteed, through means I can't analyze, that this pointer will never be null").

      Then it encountered evidence explicitly contradicting that assumption (a meaningless null check), and it handled it not by changing its assumption, but by quietly removing the evidence.

      > For example, would you not expect a compiler to remove a redundant bounds check if it can infer that an index can't be out of range?

      If it can infer it from actually good evidence, sure. But using "a pointer was dereferenced" as evidence "this pointer is safe to dereference" is comically bad evidence that only the C standard could come up with.

You typically can't prove it, but if and when you can prove it, you should definitely warn about it or even refuse to compile.

Things like the meaningless null check mentioned can definitely be found statically (the meaningless arithmetic sanity check in OP's example, I'm not so sure, at least not with C's types).

  • So, how much effort should the standard require a compiler to make for “if and when you can prove it”? You can’t, for example, reasonably require a compiler to know whether Fermat’s theorem is true if that’s needed to prove it.

    There are languages that specify what a compiler has to do (e.g. Java w.r.t. “definite assignment” (https://docs.oracle.com/javase/specs/jls/se9/html/jls-16.htm...)), and thus require compilers to reject some programs that otherwise would be valid and run without any issues, but C chose to not do that, so compilers are free to not do anything there.

    • Everyone wants to drag nontermination into this, but in the OP's example, the compiler already had proof that the conditional would never evaluate to true. What you can or can't prove in the bigger picture isn't so interesting when we already have the proof we need right now.

      It's just that it used this proof to remove the conditional evaluation (and the branch) instead of warning the user that he was making a nonsensical if statement.

      So to the question of "when can we hope to do it" the answer is, "not in all cases, sure, but certainly in this case".