Comment by js8
25 days ago
Compilation is not deterministic, see JITs and GCs. What is deterministic is the resulting program output, but not its performance. So with compilers, we traded away the determinism of performance in exchange for ease of programming.
With LLMs, we are trading away the determinism of the program output as well, in exchange for even easier programming. Is it a good or a bad thing? There are ways to mitigate the problem, just like there are with compilers.
You could argue the determinism of the program output was never really there, because the specification at a high enough level was always unclear. So we are not really losing that much, just accepting a messier reality.
Then the only remaining question is whether these computer programs (LLMs) can do a better job (and where) than a SW developer, who is supposed to translate unclear specifications into a formal language (source code). It happened with compilers: eventually they got better than all assembly programmers. The same happened with chess players.
> Compilation is not deterministic, see JITs and GCs. What is deterministic is the resulting program output, but not its performance.
Does a JIT compile some other program's code instead of the one being run? Does it produce bytecode for a different VM? Does it try to compile parts of the program that have not been executed or aren’t going to be?
Does a GC destroy objects that are in use? Does it ignore instances and memory that have been properly released?
JITs and GCs are deterministic algorithms; you can predict their behavior just by reading their code. LLM tooling involves an actual random number generator for its output.
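To make the "actual random generator" point concrete, here is a minimal Python sketch of sampling-based token selection. It is a toy under stated assumptions, not any real model's decoder: the vocabulary, the scores in `LOGITS`, and the helper names `softmax`, `sample_token`, and `generate` are all made up for illustration.

```python
import math
import random

# Hypothetical next-token scores for a toy vocabulary (made-up numbers).
LOGITS = {"Paris": 4.0, "Lyon": 2.5, "Marseille": 2.0, "Nice": 1.0}


def softmax(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample_token(logits, temperature, rng):
    """Pick one token: greedy at temperature 0, random draw otherwise."""
    if temperature == 0:
        return max(logits, key=logits.get)  # no randomness involved
    probs = softmax(logits, temperature)
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(seed, temperature, length=8):
    """Draw `length` tokens; seed=None seeds the RNG from system entropy."""
    rng = random.Random(seed)
    return [sample_token(LOGITS, temperature, rng) for _ in range(length)]


if __name__ == "__main__":
    print("seeded:  ", generate(seed=42, temperature=1.0))
    print("seeded:  ", generate(seed=42, temperature=1.0))  # same as above
    print("unseeded:", generate(seed=None, temperature=1.0))  # may differ per run
    print("greedy:  ", generate(seed=None, temperature=0))
```

In this toy, a fixed seed or a temperature of 0 makes the output repeatable, while an unseeded run at a nonzero temperature can change from one invocation to the next even though the input scores never change.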
> Does a JIT compile some other program's code instead of the one being run? Does it produce bytecode for a different VM? Does it try to compile parts of the program that have not been executed or aren’t going to be?
Sure, but the same is true for LLMs: the leading models no longer make trivial mistakes like answering "What is the capital of France?" wrong.
> JITs and GCs are deterministic algorithms; you can predict their behavior just by reading their code.
On large enough systems, you can't, just like it's difficult to predict the weather. Determinism has little to do with it. At work, I have just witnessed a bug in the JIT (it seems to have been fixed in OpenJDK 25). It inlined the wrong method. We weren't able to reproduce the error conditions without a private customer dataset.
And the fact is, historically, there have been many bugs in compilers, or compilers have simply been bad at their job of producing performant programs. The output (the resulting program) of a good compiler is difficult to understand, because it is written to be efficient. LLMs (for the programming use case) are different quantitatively, not qualitatively.
It’s really weird how you shift the goalposts and your own definitions.
No one is saying that a compiler can’t have bugs. What we have been saying is that if we treat the compiler as a black box, we’re reasonably certain, given that we know the input, what the output will be. And the output will stay the same if you keep the input the same.
But you can send the LLM the same prompt, and it will give you a different answer each time. And it’s not even about the verbiage used.