Comment by saagarjha

3 years ago

Yes, sorry for not being precise: UB applies to executions. When I said "global" I meant global over that entire execution, so if your path ends up hitting undefined behavior, the compiler can go back and logically undo the entire execution, including parts it shares with a well-defined execution or where you'd generally expect side effects to be placed.

No, that logically doesn't make sense. The program cannot know whether it is going through a particular execution ahead of time without actually executing all the side effects along that path first (which in this case would include the fflush()). The very difference between a "program" and a "program execution" is the fact that an execution includes the interactions of the program with the external world (as defined by the standard, all of which I loosely called "inputs" in my previous comment). These interactions extend prefixes of the execution: the program performs its semantics according to the abstract machine and observes the responses from the external world. You don't have an "execution" of the program up to the point of UB until the interactions (aka side effects) up to that point have actually occurred (and the system's responses have been observed so the execution can continue).

P.S. Have you ever seen a single example of a compiler time-traveling UB through observable behavior like this? I sure haven't. If you have, I'd love to see it, because despite all the crazy ways compilers take advantage of UB, I've never seen a C/C++ compiler actually behave as though this were legal (if it's even logically possible).

  • What about the following code:

        if (x > 4) {
            fflush();
            int y = INT_MAX + 1;
        }
    

    Can the compiler not use that to assume that (x > 4) is false because otherwise it triggers undefined behavior? Hence it is allowed to drop the entire branch?

    The only real counter-argument I could see is "fflush might terminate the program, hence we need to run the function before we know if UB will be triggered". I suppose once you call a function that the compiler cannot analyze (e.g. system-calls, FFIs) the compiler may not be certain the function doesn't contain an 'exit()' call.

    • That's right, I think. If you replace the "fflush()" (which should have an argument by the way) with "f()" and declare "void f(void);" then the test and the call appear in the binary. But if you declare "__attribute__((pure)) void f(void);" then the test and the call disappear.
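
The experiment described above can be sketched roughly as follows. This is a minimal sketch, not the commenter's actual test file: the name `guarded` is hypothetical, and `f` is defined here only so that the snippet links on its own. To reproduce the reported result, the definition of `f` would have to live in a separate file so that the compiler sees only the declaration `void f(void);`. The overflowing value is returned rather than stored in an unused local, so that it cannot simply be discarded as dead code.

```c
#include <limits.h>
#include <stdio.h>

/* Stand-in for the opaque f() from the comment above. ASSUMPTION: in a real
   experiment this definition lives in another translation unit, so the
   compiler cannot tell whether f() returns, exits, or loops forever. */
void f(void) {
    fflush(stdout);
}

/* Hypothetical variant of the snippet under discussion. */
int guarded(int x) {
    if (x > 4) {
        f();                 /* side effect sequenced before the overflow */
        return x + INT_MAX;  /* signed overflow: undefined for every x > 4 */
    }
    return 0;                /* well-defined path, no overflow */
}
```

Calling `guarded` with any `x <= 4` never reaches the overflow and is well-defined. Compiling with optimizations (for example `-O2 -S`) and looking for the comparison and the call in the output is one way to check what a particular compiler does; the result depends on the compiler, its version, and the flags used.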

    • > The only real counter-argument I could see is "fflush might terminate the program, hence we need to run the function before we know if UB will be triggered".

      The thing to realize is there is no such thing as "UB will be triggered". The only thing that exists is "UB is triggered", combined with the as-if rule, which allows modifications that don't affect what the standard considers observable behavior. Or in other words, the standard defines a program according to its observable behavior. People think it's time-travel because they think of the program in terms of expressions and statements rather than side effects, but if you think of the programs in terms of observable behaviors rather than the lines of code executing, you see that there's no time travel.
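
As a small illustration of the as-if rule mentioned above (a sketch with hypothetical function names, not taken from the thread): the two functions below return the same value for the same input, so a caller that only prints the result cannot tell them apart. An optimizer is therefore allowed to replace the loop with the closed form, even though the "lines of code executing" are completely different. Whether a given compiler actually does so is not something this sketch claims.

```c
/* Adds 1..n one step at a time, the way the source is written. */
unsigned long sum_loop(unsigned long n) {
    unsigned long total = 0;
    for (unsigned long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Closed form n(n+1)/2. Returns the same value as sum_loop as long as
   n * (n + 1) does not wrap around in unsigned long. */
unsigned long sum_closed(unsigned long n) {
    return n * (n + 1) / 2;
}
```

Neither function has side effects, so the only thing the rest of the program can observe is the returned value.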

    • A somewhat pointless example, because the "int y..." is going to be pruned away anyway, since the result is not used anywhere.

      Hence it won't trigger any undefined behavior.

  • A quote from the C++ standard:

    "However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation)."

    Quote found here: https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=63...

    It seems that the standard explicitly disagrees with you.

    • The "[...] executing that program with that input [...]" part could perhaps be read as making it specific to a given UB-triggering execution; but I'm no language lawyer :).

  • I agree with your sentiment, but the way I square that with what I mentioned is that the compiler can undo side effects. As far as I am aware there is nothing special about fflush in the standard where you can't go back to where the program was before it happened.

    (I have never actually seen a compiler act on this, but I maintain that this is just because they're either not willing to optimize on this or unable to do so. But there's a lot of UB that compilers do not exploit, so this isn't particularly concerning to me.)