Comment by ants_everywhere
5 days ago
It's reasonable to expect people to know how to use their tools well.
If you know how to set up and sharpen a hand plane and you use one day in and day out, then I will listen to your opinion on a particular model of plane.
If you've never used one before and you write a blog post about running into the same issues every beginner runs into with planes, then I'm going to discount your opinion that they aren't useful.
> It's reasonable to expect people to know how to use their tools well.
This shows the core of the flaw in the argument.
"The tool is great. If the result is not perfect, it is the user to blame."
It's unfalsifiable. The LLM can provide terrible results for reasonable prompts, yet the response is never that LLMs are limited or have flaws, only that the user needs to prompt better or try again with a better LLM next week.
And more importantly, this is for the good case where the user has the discernment and motivation to know that the result is bad.
There are going to be lots of bad outputs slipping past human screeners, and many in the AI crowd will say "the prompt was bad", or "that model is obsolete, new models are better" ad infinitum.
This isn't to say that we won't get LLMs that produce great output with imperfect prompts eventually. It just won't be achieved by blaming the user rather than openly discussing the limitations and working through them.
> This shows the core of the flaw in the argument.
> "The tool is great. If the result is not perfect, it is the user to blame."
That's not what the parent said. A tool can be useful without being perfect. A result can be good without being perfect.
LLMs are power tools, and can be dangerous (to your codebase, your data, or your health) if used improperly[0].
If you hold the chainsaw wrong and saw off your foot, it's not the chainsaw's fault[1].
[0] In this case "properly" means understanding they are nondeterministic, can hallucinate, the output will vary and should be verified, and that GIGO still applies.
[1] The Altmans of the industry do the technology a great disservice by claiming "AGI achieved", "will replace workers", "PhD level intelligence" and "it just works". It's false marketing, plain and simple. When you set expectations sky high, of course any tech will disappoint.
The funniest part is that you can't get better than the input you put in.
Only when the thing can generalize from a few top notch examples would this problem be solved.
Throwing a ton of garbage into an automated remixer will produce garbage output, and better classification of the input can't fix that. Even with a 0.01% false-positive rate, there is so much trash input that the system will learn and reproduce it.
So the future should be developing few-shot learning further...
It's reasonable to expect tools to produce reasonable, predictable output so that they can be used well. A tool can have awful, dangerous failure modes as long as they can be anticipated and worked around. This is the critical issue with AI: it's not deterministic.
And because it always comes up, no, not even if temperature is set to 0. It still hinges on insignificant phrasing quirks, and the tiniest change can produce drastically different output. Temperature 0 gives you reproducibility but not the necessary predictability for a good tool.
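To make the reproducibility vs. predictability distinction concrete, here is a minimal toy sketch. It is not an LLM: `sample_token`, the three-word vocabulary, and both sets of logits are made up for illustration, and the assumption is that a tiny rephrasing nudges the model's scores by a tiny amount.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick an index from logits; temperature 0 is treated as greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

vocab = ["yes", "no", "maybe"]
logits_a = [2.00, 1.99, 0.50]  # hypothetical scores for one phrasing
logits_b = [1.99, 2.00, 0.50]  # hypothetical scores after a tiny rephrasing

rng = random.Random(0)
# Temperature 0: every repeat of the same input gives the same answer...
greedy_a = [vocab[sample_token(logits_a, 0, rng)] for _ in range(5)]
# ...but a 0.01 shift in the scores flips the answer completely.
greedy_b = [vocab[sample_token(logits_b, 0, rng)] for _ in range(5)]
print(greedy_a, greedy_b)
```

Each list is internally consistent (reproducible), but the two lists disagree with each other even though the inputs are nearly identical, which is roughly the kind of unpredictability described above.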
I don't think the "non-deterministic" accusation is a good one. Same as "hallucination", it's a bit of misdirection.
These LLMs are buggy. They have bugs. They don't do what they promise. They do it sometimes, other times they give garbled output.
This is buggy software. And after years and billions of dollars, the bug persists.
Yes, we've all heard the "AIs are not deterministic" trope ad nauseam, but that's unrelated to my point.
MCMC is also not deterministic, and yet people learn how to use it well. Being non-deterministic is kind of the whole point of anything based on statistics. It's deterministic conditioned on the seed.
MCMC is reliably predictable. I believe I made it clear in my last comment that predictability was the goal, not actual run-to-run determinism, which is achievable.
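For anyone following along, both MCMC claims can be checked with a minimal sketch: a random-walk Metropolis sampler targeting a standard normal. The function name `metropolis_mean`, the step size, and the step count are arbitrary choices for illustration, not anyone's production setup.

```python
import math
import random

def metropolis_mean(seed, n_steps=20000, step=1.0):
    """Estimate the mean of a standard normal with random-walk Metropolis."""
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step, step)  # symmetric proposal
        # Log acceptance ratio for a target density proportional to exp(-x^2 / 2)
        log_ratio = (x * x - proposal * proposal) / 2.0
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        total += x
    return total / n_steps

# Same seed: the run is exactly repeatable ("deterministic conditioned on the seed").
same_seed = (metropolis_mean(42), metropolis_mean(42))
# Different seeds: different runs, but the estimates should all land near 0.
other_seeds = [metropolis_mean(s) for s in range(5)]
print(same_seed, other_seeds)
```

The two same-seed results are identical, and the different-seed results differ from each other while staying close to the true mean of 0. That second property, small and bounded variation across runs, is the predictability being argued about here.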