
Comment by avidiax

5 days ago

> It's reasonable to expect people to know how to use their tools well.

This shows the core of the flaw in the argument.

"The tool is great. If the result is not perfect, it is the user to blame."

It's unfalsifiable. The LLM can produce terrible results for reasonable prompts, yet the response is never that LLMs are limited or flawed; it's always that the user needs to prompt better, or should try again with a better LLM next week.

And more importantly, this assumes the good case, where the user has the discernment and motivation to recognize that the result is bad.

There are going to be lots of bad outputs slipping past human screeners, and many in the AI crowd will say "the prompt was bad", or "that model is obsolete, new models are better" ad infinitum.

This isn't to say that we won't get LLMs that produce great output with imperfect prompts eventually. It just won't be achieved by blaming the user rather than openly discussing the limitations and working through them.

> This shows the core of the flaw in the argument.

> "The tool is great. If the result is not perfect, it is the user to blame."

That's not what the parent said. A tool can be useful without being perfect. A result can be good without being perfect.

LLMs are power tools, and can be dangerous (to your codebase, your data, or your health) if used improperly[0].

If you hold the chainsaw wrong and saw off your foot, it's not the chainsaw's fault[1].

[0] In this case "properly" means understanding that they are nondeterministic, that they can hallucinate, that the output will vary and should be verified, and that GIGO still applies. (A short sketch of that verification discipline follows note [1].)

[1] The Altmans of the industry do the technology a great disservice by claiming "AGI achieved", "will replace workers", "PhD level intelligence" and "it just works". It's false marketing, plain and simple. When you set expectations sky high, of course any tech will disappoint.
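
To make note [0] concrete, here is a minimal sketch of what "verify the output" can look like in practice. It assumes a hypothetical call_llm stand-in for whatever client is actually in use; the only point is that a nondeterministic tool gets a validate-and-retry loop rather than blind trust.

```python
import json


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever LLM client is actually in use."""
    raise NotImplementedError


def get_validated_json(prompt: str, max_attempts: int = 3) -> dict:
    """Treat the model as nondeterministic: validate every response and
    retry on failure instead of trusting the first thing it returns."""
    for _ in range(max_attempts):
        raw = call_llm(prompt)
        try:
            parsed = json.loads(raw)  # verification step: parse before use
        except json.JSONDecodeError:
            continue                  # output varies run to run; ask again
        if isinstance(parsed, dict):
            return parsed
    raise ValueError(f"no valid response after {max_attempts} attempts")
```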

The funniest part is that you can't get output that's better than the input you put in.

This problem will only be solved once the model can generalize from a few top-notch examples.

Throwing a ton of garbage into an automated remixer produces garbage output, and better classification can't fix that. Even at a 0.01% false positive rate, there is so much trash in the input that the system will still learn and reproduce it.
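
A rough back-of-the-envelope illustration of why a tiny false positive rate is not enough at web scale; the figures below are assumptions chosen for the sake of the argument, not measurements.

```python
# Illustrative numbers only; both values are assumptions.
garbage_docs = 5_000_000_000      # low-quality documents in a web-scale crawl
false_positive_rate = 0.0001      # 0.01% of garbage misclassified as "clean"

leaked = garbage_docs * false_positive_rate
print(f"{leaked:,.0f} garbage documents still reach the training set")
# -> 500,000 garbage documents still reach the training set
```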

So the future should be about developing few-shot learning further...
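
For reference, the "few good examples" idea already exists in prompt form. Below is a toy sketch of few-shot prompting, with entirely made-up example pairs, just to show the shape of it: a handful of curated demonstrations placed in front of the actual task.

```python
# Toy few-shot prompt: a handful of curated examples instead of bulk data.
EXAMPLES = [
    ("The meeting moved from 2pm to 3pm.", "Meeting rescheduled to 3pm."),
    ("The release slipped because tests keep failing.", "Release delayed by failing tests."),
]


def build_few_shot_prompt(task_input: str) -> str:
    """Prepend the curated examples so the model can generalize from them."""
    shots = "\n\n".join(f"Input: {i}\nSummary: {o}" for i, o in EXAMPLES)
    return f"{shots}\n\nInput: {task_input}\nSummary:"


print(build_few_shot_prompt("The outage was traced to a bad config push."))
```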