Comment by uecker

1 day ago

Most of the "ghosts" are indeed just cleaning up the wording. But compiler writers have historically used any hint that the standard is unclear as an excuse to justify aggressive optimization. This ranges from an overreaching interpretation of UB itself, to wacky concepts such as time-travel and wobbly numbers, to incorrect implementations of aliasing (e.g. still present in clang) and of pointer-to-integer round trips.
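To make "time-travel" concrete: since dereferencing a null pointer is UB, a compiler may assume it never happens and let that assumption flow backwards past earlier code. A minimal sketch (names are mine; whether the check is actually deleted depends on the compiler and on flags such as GCC's -fdelete-null-pointer-checks):

```c
#include <stdio.h>

int get(int *p) {
    int x = *p;       /* UB if p is NULL, so the compiler may infer p != NULL */
    if (p == NULL) {  /* ...and fold this check to false, deleting the branch, */
        puts("null"); /* even though on a real machine the earlier load might  */
        return -1;    /* not have trapped */
    }
    return x;
}
```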

I'm sure the compiler authors will disagree that they were "using any excuse". From their point of view they were merely making transformations between equivalent programs, so any mistake is either that these are not in fact equivalent programs because they screwed up (which is certainly sometimes the case), or that the standard should not have said they were equivalent, but it did.

One huge thing they have on their side is that their implementation is concrete. Whatever it is that, say, GCC does is de facto a thing a compiler can do. The standards bodies (and WG21 has been worse by some margin, but both are guilty) may standardize anything, but a compiler can only implement some of it. "Just do X" where X isn't practical works fine on paper but is not implementable. That was the fate of the consume ordering: consume/release works fine on paper, you "just" need whole-program analysis to implement it. That is of course not practical, so it isn't implemented.
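For reference, this is roughly what a consume/release pair looks like in C11 (names are mine). On paper the data dependency from the loaded pointer to n->payload orders the two loads without a fence on weakly ordered hardware; in practice, mainstream compilers simply promote consume to acquire, precisely because tracking dependencies through the optimizer would need something like the whole-program analysis described above:

```c
#include <stdatomic.h>
#include <stddef.h>

struct node { int payload; };
static _Atomic(struct node *) head = NULL;

/* Writer: publish a fully initialized node. */
void publish(struct node *n) {
    atomic_store_explicit(&head, n, memory_order_release);
}

/* Reader: the consume load is meant to order the dependent
   load of n->payload via the data dependency alone. */
int reader(void) {
    struct node *n = atomic_load_explicit(&head, memory_order_consume);
    return n ? n->payload : -1;
}
```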

  • They sometimes screwed up, sometimes simply because of bugs, sometimes because different optimization passes made mutually inconsistent assumptions. This somewhat contradicts your second point: compilers have things implemented that may be concrete in some sense (because they are in a compiler), but that are still not really a "thing", because they are a mess nobody can formalize into a coherent set of rules.

    But then, they also sometimes misread the standard in ways I can't really understand. This can often be seen when the "interpretation" changes over time: earlier compilers (or even earlier parts of the same compiler) implement the standard as written, then some new optimization pass introduces a creative interpretation.

    • Certainly compiler developers are only human, and many of them write C++, so they're humans working with a terrible programming language; I wouldn't sign up for that either (I have written small contributions to compilers, but not in C++). I still don't see "any excuse". I see the more usual human laziness and incompetence; LLVM, for example, IMNSHO doesn't work hard enough to ensure its IR has coherent semantics and to deliver on those semantics.

      The compiler bug I'm most closely following, and which I suspect you have your eye on too, is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119472, aka https://github.com/rust-lang/rust/issues/107975 and https://github.com/llvm/llvm-project/issues/45725.

      But it seems like everybody fucked this up in similar ways: that's two different major compiler backends! I wouldn't be surprised if Microsoft (whose code we can't see) found that they don't get this quite right either.
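      For anyone who hasn't dug into those reports: the test cases revolve around pointer-to-integer round trips, roughly of this shape (a sketch, not the exact reproducer; whether the branch is taken at all depends on how x and y happen to be laid out):

      ```c
      #include <stdio.h>
      #include <stdint.h>

      int main(void) {
          int x = 0, y = 0;
          uintptr_t xi = (uintptr_t)(&x + 1); /* one-past-the-end of x */
          uintptr_t yi = (uintptr_t)&y;
          if (xi == yi) {          /* true only if y sits right after x      */
              int *p = (int *)xi;  /* round-trip the integer back to pointer */
              *p = 10;             /* at runtime this stores to y's address, */
              printf("%d\n", y);   /* yet optimizers have printed 0 here by  */
          }                        /* assuming p cannot alias y              */
          return 0;
      }
      ```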
