Comment by tekne
19 hours ago
Wait... why?
Making an unreliable, nondeterministic system give reliable results for a bounded task with well-understood parameters is... like half of engineering, no?
There's a huge difference between "generate this code, here's a vague feature description" and "here's a list of criteria, assign this input to one of these buckets" -- the latter is obviously subject to prompt engineering, hallucination, etc. -- but so is a human pipeline!
>the latter is obviously subject to prompt engineering, hallucination, etc. -- but so is a human pipeline!
...which is why we write deterministic code to take the human out of the pipeline. One of the early uses of computers was calculating firing tables for artillery, to replace teams of humans that were doing the calculations by hand (and usually with multiple humans performing each calculation to catch errors). If early computers had a 99% chance of hallucinating the wrong answer to an artillery firing table, the response from the governments and militaries that used them would not be to keep using computers to calculate them. It would be to go back to having humans do it with lots of manual verification steps and duplicated work to be sure of the results.
If you're trying to make LLMs (a vague simulacrum of humans) with their inherent and unsolvable[1] hallucination problems replace deterministic systems, people are going to eventually decide to return to the tried and true deterministic systems.
1: https://arxiv.org/abs/2401.11817
Because it's not possible. There is nothing you can say to the LLM that will guarantee that something happens. That's not how it works. At best it will maybe be taken into consideration, if you're lucky.
But if you're trying to tell me that every time you list criteria you get them all perfectly matched, you're clearly gifted.
I'm being deliberately pedantic, but depending on the numeric representation we use for the neural network (rounding behavior) as well as the choice of inference rule (that is, given a distribution over next tokens, which one to pick), it can absolutely be reproducible and completely deterministic.
Though chaotic, which I believe is the better word here - a single-letter change in the input may produce wildly different results.
We just tend to choose more random inference rules, because they give better results.
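The distinction can be sketched concretely. Below is a toy next-token chooser (not a real LLM; the logits are made up for illustration): greedy decoding (always take the argmax) is fully deterministic, while temperature sampling is only reproducible when the RNG seed is fixed.

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Convert raw scores into a probability distribution;
    # subtract the max for numerical stability.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    # Deterministic rule: always pick the highest-scoring token.
    return max(range(len(logits)), key=lambda i: logits[i])

def sample(logits, rng, temperature=1.0):
    # Stochastic rule: draw from the distribution -- reproducible
    # only if the RNG state (seed) is identical between runs.
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.5, 0.3]

# Greedy: same answer on every call, no randomness involved.
assert all(greedy(logits) == 0 for _ in range(100))

# Sampling: two runs agree only because they start from the same seed.
a = [sample(logits, random.Random(42)) for _ in range(10)]
b = [sample(logits, random.Random(42)) for _ in range(10)]
assert a == b
```

A real serving stack adds further sources of divergence (batching, non-associative floating-point reductions on GPUs), so matching the seed alone is necessary but not always sufficient.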
With determinism you're not wrong. The problem is that you'd need to make sure all your seeds, temperatures, and other input parameters are exactly the same, and, importantly, that all context is cleared. But people don't do that. And I'm not sure if any provider even lets you set all of those parameters.