Comment by pjc50

3 years ago

> But you typically can’t prove that. There’s lots of code where you could prove it might happen at runtime for some inputs, but proving that such inputs occur would, at least, require whole-program analysis. The moment a program reads outside data at runtime, chances are it becomes impossible.

No, I specifically ruled out doing that in my comment.

I was referring to the situation where a null check was deleted because the compiler found UB through static analysis.

(Or specifically, placing a null check after a possibly-null usage. It is wrong to assume that after possibly-null usage the possibly-null variable is definitely-null.)

As I recall, the compiler didn't know it had found undefined behaviour. An optimisation pass saw "this pointer is dereferenced", and from that inferred that if execution continued, the pointer can't be null.

If the pointer can't be null, then code that only executes when it is null is dead code that can be pruned.

Voila, null check removed. And most relevantly, it didn't at any point know "this is undefined behaviour". At worst it assumed that dereferencing a null would mean it wouldn't keep executing.
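A minimal sketch of the pattern being described, with hypothetical function names (`read_then_check`, `check_then_read`) that do not come from the thread. In the first function the dereference precedes the test, so an optimiser may treat the test as dead code; in the second the test comes first and has to stay.

```c
#include <stddef.h>

/* Dereference first, check second: the pattern under discussion.
 * Once `*p` has executed, an optimiser may assume p != NULL and
 * drop the branch below. Calling this with NULL is undefined. */
int read_then_check(const int *p) {
    int value = *p;
    if (p == NULL) {      /* may be pruned as dead code */
        return -1;
    }
    return value;
}

/* Check first, dereference second: the branch is reachable when
 * p == NULL, so it cannot be removed. */
int check_then_read(const int *p) {
    if (p == NULL) {
        return -1;
    }
    return *p;
}
```

Only `check_then_read(NULL)` has a defined result (-1 here); for valid pointers the two functions behave identically.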

  • It removed a redundant check then, so why not warn about that? gcc -Wpedantic even warns about empty statements fcol.

The compiler didn't find UB. What it saw was a pointer dereference, followed by some code later on that checked if the pointer was null.

Various optimisation phases in compilers try to establish the possible values (or ranges) of variables, and later phases can then use this to improve calculations and comparisons. It's very generic, and useful in many circumstances. For example, if the compiler can see that an integer variable 'i' can only take the values 0-5, it could optimise away a later check of 'i<10'.

In this specific case, the compiler reasoned that the pointer variable could not be zero, and so checks for it being zero were pointless.
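The integer range example above can be sketched roughly as follows. The table and the function name `lookup_mod6` are made up for illustration; whether a particular compiler actually drops the comparison depends on its optimisation level.

```c
static const int small_table[6] = {10, 11, 12, 13, 14, 15};

int lookup_mod6(unsigned n) {
    unsigned i = n % 6;        /* i is provably in the range 0..5 */
    if (i < 10) {              /* always true given that range */
        return small_table[i];
    }
    return -1;                 /* unreachable; may be optimised away */
}
```

Nothing here involves undefined behaviour: the range of `i` follows from the `% 6`, and the `i < 10` test is redundant for that reason alone.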

  • Yes - and the original post here is the same:

        if (x < 0)
            return 0;
    

    The compiler now knows x's possible range is non-negative.

        int32_t i = x * 0x1ff / 0xffff;
    

    A non-negative multiplied and divided by positive numbers means that i's possible range is also non-negative (this is where the undefinedness of integer overflow comes in - x * 0x1ff can't have a negative result without overflow occurring).

        if (i >= 0 && i < sizeof(tab)) {
    

    The first conditional is trivially true now, because of our established bounds on i, so it can just be replaced with "true". This is what causes the code to behave contrary to the OP's expectations: with his execution environment in the overflow case we can end up with a negative value in i.
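One way to keep the range reasoning valid is to make sure the multiplication cannot overflow in the first place. The following is only a sketch: the declaration of `tab` is not shown in the thread, so its type, size, and contents are assumptions, and `lookup_wide` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed table; the thread does not show its declaration. */
static const uint8_t tab[256] = {1, 2, 3};

uint8_t lookup_wide(int32_t x) {
    if (x < 0)
        return 0;
    /* Widen before multiplying: INT32_MAX * 0x1ff fits in int64_t,
     * so no signed overflow can occur here. */
    int64_t i = (int64_t)x * 0x1ff / 0xffff;
    if (i >= 0 && (uint64_t)i < sizeof(tab))
        return tab[i];
    return 0;
}
```

The `i >= 0` test is still provably true, but now because the arithmetic really cannot go negative, so removing it changes nothing observable.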

  • It is probably more precise to say “if the pointer is null, then it doesn’t matter what I do here, so I am permitted to eliminate this” than to say that it can’t be null here. (It can’t be both null and defined behavior.)

    • I'm not sure that's right. The compiler isn't tracking undefined behaviour, it is tracking possible values. It just happens that one specific input into determining these values is the fact "a valid program can't dereference a null pointer", so if the source code ever dereferences a pointer, the compiler is free to reason that the pointer cannot therefore be null.

      In essence, the compiler is allowed to assume that your code is valid and will only do valid things.

Consider function inlining, or use of a macro for some generic code. For safety, we include a null check in the inlined code. But then we call it from a site where the variable is known to not be null.

The compiler hasn't found UB through static analysis, it has found a redundant null check.
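A small sketch of that inlining case, using hypothetical names (`safe_strlen`, `greeting_length`). The helper keeps its defensive check for general callers, while at this particular call site the argument is the address of an array and so can never be null.

```c
#include <stddef.h>
#include <string.h>

/* Generic helper with a defensive null check. */
static inline size_t safe_strlen(const char *s) {
    if (s == NULL)
        return 0;
    return strlen(s);
}

size_t greeting_length(void) {
    static const char greeting[] = "hello";
    /* After inlining, the s == NULL branch is provably dead here:
     * the address of an array object is never a null pointer. */
    return safe_strlen(greeting);
}
```

The check still protects callers that pass a pointer of unknown origin; it is only redundant in the inlined copy.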

> I was referring to the situation where a null check was deleted because the compiler found UB through static analysis.

You can say that but in practice -Onone is fairly close to what you're asking for already. Most people are 100% unwilling to live with that performance tradeoff. We know that because almost no one builds production software without optimizations enabled.

The compiler is not intelligent. It just tries to make deductions that let it optimize programs to run faster. 99.999% of the time when it removes a "useless" null check (aka a branch that has to be predicted, eats up branch prediction buffer space, and bloats the number of instructions) it really is useless. The compiler can't tell the difference between the useless ones and security critical ones because all of them look the same and are illegal by the rules of the language.

Even if you mandate that null checks can't be removed, that doesn't fix all the other situations where inserting the relevant safety checks has huge perf costs or where making something safe reduces to the halting problem.

FWIW I agree that the committee should undertake an effort to convert UB to implementation-defined where possible... for example just mandate twos complement integer representations and make signed integer overflow ID.

To illustrate the complexity: most loops end up using an int, which is 32-bit on most 64-bit platforms. If you require signed integer wrapping, that slows down all loops, because the compiler must insert artificial checks to make the 64-bit register perform 32-bit wrapping, and we can't change the size of int at this point.
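For code that wants wrapping today, it can be spelled out through unsigned arithmetic, which is defined to wrap. This is a sketch with a hypothetical helper name (`wrapping_add_i32`); GCC and Clang also offer `-fwrapv` to make signed overflow wrap for a whole translation unit.

```c
#include <stdint.h>

/* Two's complement wrapping addition without signed overflow. */
int32_t wrapping_add_i32(int32_t a, int32_t b) {
    uint32_t u = (uint32_t)a + (uint32_t)b;   /* wraps modulo 2^32 */
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    /* u is in [2^31, 2^32): map it to u - 2^32 step by step so that
     * no intermediate value leaves the int32_t range. */
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
}
```

For example, `wrapping_add_i32(INT32_MAX, 1)` yields `INT32_MIN` rather than invoking undefined behaviour.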

  • > FWIW I agree that the committee should undertake an effort to convert UB to implementation-defined where possible... for example just mandate twos complement integer representations and make signed integer overflow ID.

    To accommodate trapping implementations you'd have to make it "implementation-defined or an implementation-defined signal is raised" which, as it happens, is exactly the wording for when an out-of-range value is assigned to a signed type. In practice it means you have to avoid it in your code anyway because "an implementation-defined signal is raised" means "your program may abort and you can't stop it".

But again, the compiler did not find UB through static analysis. The compiler inferred that the pointer could not be null and removed a redundant check.

For example, would you not expect a compiler to remove a redundant bound check if it can infer that an index can't be out of range?

  • The compiler made a dangerous assumption that the standard permits ("the author surely has guaranteed, through means I can't analyze, that this pointer will never be null").

    Then it encountered evidence explicitly contradicting that assumption (a meaningless null check), and it handled it not by changing its assumption, but by quietly removing the evidence.

    > For example, would you not expect a compiler to remove a redundant bound check if it can infer that an index can't be out of range?

    If it can infer it from actually good evidence, sure. But using "a pointer was dereferenced" as evidence "this pointer is safe to dereference" is comically bad evidence that only the C standard could come up with.

    • > using "a pointer was dereferenced" as evidence "this pointer is safe to dereference" is comically bad evidence

      Do you think the compiler would be right to remove the second check here?

         if (!x) std::abort();
         if (!x) return;
         ... = *x;
      

      What about changing std::abort with the following?

         [[noreturn]] void my_abort();
      

      How's that different from a check after dereferencing a pointer? In both cases the check can be removed because of dataflow or control flow analysis.

      What if my_abort returns instead? Or another thread changes x after the fact?
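The snippet above is C++; a rough C11 equivalent, using `_Noreturn` in place of `[[noreturn]]`, might look like the sketch below. The body given to `my_abort` and the name `read_or_die` are assumptions made so that the example is complete.

```c
#include <stdlib.h>

/* Hypothetical user-defined abort, declared as never returning. */
_Noreturn void my_abort(void) {
    abort();
}

int read_or_die(const int *x) {
    if (!x) my_abort();
    /* Control only reaches this point with x != NULL, provided
     * my_abort keeps its promise, so this check is dead code. */
    if (!x) return -1;
    return *x;
}
```

If a function declared `_Noreturn` did return, the behaviour would be undefined, so the compiler is entitled to rely on the declaration in the same way it relies on the dereference in the earlier case.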