Comment by AlexeyBrin
12 days ago
This is an exaggeration: even if you store the prompt that was "compiled" by today's LLMs, there is no guarantee that 4 months from now you will be able to replicate the same result.
I can take some C or Fortran code from 10 years ago, build it and get identical results.
That is a wobbly assertion. You would certainly need to run the same compiler and forgo any recent optimisations, architecture updates, and the like if your code has numerically sensitive parts.
You certainly can get identical results, but frequently the path there is not that simple.
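To make the numerical point concrete, here is a toy C sketch (not taken from any real codebase): the same sum, evaluated in two orders, gives two different float results, and evaluation order is exactly what newer compilers and flags like -ffast-math are allowed to change.

    #include <stdio.h>

    int main(void) {
        float big = 1.0e8f, small = 1.0f;

        /* Mathematically both expressions equal 1.0, but float addition
           is not associative, so evaluation order decides the answer. */
        float a = (big + small) - big;   /* small is absorbed -> 0.0 */
        float b = (big - big) + small;   /*                       1.0 */

        printf("a = %f, b = %f\n", a, b);
        return 0;
    }

With -ffast-math (or similar unsafe-math flags) the compiler is permitted to reassociate one form into the other, which is how "same source, same results" quietly becomes "same source, same compiler, same flags, same results".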
> You certainly can get identical results, but frequently the path there is not that simple.
But at least I know that if I need to, I can do it. With an LLM, if you don't store the original weights, all bets are off. Reproducibility of results can be a hard requirement in certain cases or industries.
The more important point is that even when you don't get identical binary output, you still get identical observable behavior as specified by the programming language, barring compiler bugs. That's not the case for LLMs; they are more like a compiler that is randomly buggy on every run. You wouldn't want to use such a compiler.
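For contrast, a toy sketch of what "observable behavior specified by the language" means: any conforming C compiler, at any optimisation level, has to print the same number here, even though the generated binaries will differ.

    #include <stdio.h>

    int main(void) {
        /* Unsigned integer arithmetic is fully specified by the C
           standard, so every conforming compiler must print 333833500
           regardless of optimisation level or code generation choices. */
        unsigned long sum = 0;
        for (unsigned long i = 1; i <= 1000; i++)
            sum += i * i;
        printf("%lu\n", sum);
        return 0;
    }

An LLM "recompiling" the same prompt gives you no analogous guarantee about its output.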