Comment by petcat
11 hours ago
> A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent.
Here are the miscompilation bugs reported in GCC so far in 2026, the ones labeled "wrong-code".
https://gcc.gnu.org/bugzilla/buglist.cgi?chfield=%5BBug%20cr...
I count 121 of them.
If you can’t understand the difference between a bug that, in a rare edge case, causes a compiler to generate a wrong instruction and an LLM that will generate two completely different programs with zero overlap because you added a single word to your prompt, then I don’t know what to tell you.
The point is that expert humans (the GCC developers) writing code (C++) that generates code (ASM) does not appear to be as deterministic as you seem to think it is.
I’m very aware of that, but I’m also aware that the compiler fails to emit semantically equivalent code rarely enough that most people can ignore it. That’s not the case with LLMs.
I’m also concerned less with non-determinism than with chaos. Determinism in LLMs is likely solvable; prompt instability is not.
Classic HN-ism: focusing on the semantics of a statement while ignoring the greater point in order to argue that someone is wrong.