Comment by amelius

10 hours ago

But if an LLM says "I don't know" should you pay for the tokens?

Why not? It did the work. Why should you expect it to be omniscient?

We can rank them based on how much they know and people will gravitate towards those that do know more.

It's a market after all.

  • If it’s a market, wouldn’t the incentive be to lie about knowing and thus to keep the hallucinations?

    • If you had an LLM that could accurately predict when a claim is uncertain, it would be very popular, I think. I would pay for that kind of reliability tbh

    • Up to the point where consumers notice and decide to stop using these models because of it.

      Might be why we're already rarely seeing models output an "I don't know".

    • According to your logic the market will produce an LLM that consists only of 'PRINT "I don't know."'.

"I don't know" has positive value: presumably you could prompt further to learn more about where it got stuck. It also increases the value of correct answers, by improving confidence that answers are actually correct.

"Confidently incorrect" has negative value. At best, a human realizes the answer is wrong. At worst, the incorrect information is not identified and can cause untold damage. By having the potential to be so severely wrong, it lessens the value of correct answers, because there is lower confidence in the output.
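
To put rough numbers on that argument, here is a minimal sketch. The payoffs and probabilities are made up purely for illustration (nothing here is measured or taken from a real model); the only assumption that matters is that a confident wrong answer costs more than an "I don't know".

```python
# Hypothetical payoffs per question, chosen only for illustration.
VALUE_CORRECT = 1.0    # a correct answer is worth something
VALUE_ABSTAIN = 0.0    # "I don't know" gains nothing but does no damage
VALUE_WRONG = -3.0     # a confident wrong answer can do real damage


def expected_value(p_correct: float, p_abstain: float) -> float:
    """Expected payoff per question for a model that is correct with
    probability p_correct, says "I don't know" with probability p_abstain,
    and is confidently wrong the rest of the time."""
    p_wrong = 1.0 - p_correct - p_abstain
    if p_wrong < -1e-9:
        raise ValueError("p_correct + p_abstain must not exceed 1")
    p_wrong = max(p_wrong, 0.0)  # clamp tiny negative rounding error
    return (p_correct * VALUE_CORRECT
            + p_abstain * VALUE_ABSTAIN
            + p_wrong * VALUE_WRONG)


# Three hypothetical models:
always_answers = expected_value(p_correct=0.7, p_abstain=0.0)   # wrong 30% of the time
admits_gaps = expected_value(p_correct=0.7, p_abstain=0.25)     # wrong 5% of the time
only_idk = expected_value(p_correct=0.0, p_abstain=1.0)         # never answers anything

print(always_answers, admits_gaps, only_idk)
```

Under these assumed numbers, the model that admits its gaps comes out ahead (about 0.55 per question), the one that only ever says "I don't know" sits at 0, and the confident guesser ends up negative (about -0.2). Different payoffs would shift the ranking, so treat this as a way to frame the trade-off rather than a result.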

Depends on what your understanding of the product is.

If someone sold you a "solve all your problems" machine, and it suddenly doesn't solve all your problems, then probably no, you shouldn't pay.

But the way I'm being sold LLMs is basically "A text generator that gives you plausible-sounding human text that sometimes hallucinates and gets things wrong, based on your input". So regardless of what the outcome is, I still made use of the "Input > Output" part, which is what I bought into, and I should still pay for that.

Now of course a bunch of people will say they've been sold the former, but the companies themselves seem to be selling the latter. That's my perspective as a person who doesn't follow "influencers" and whatnot though, who seem to be selling the public on the former rather than the latter.

  • Let's pretend I am someone who has heard people talk about ChatGPT, but has no idea what it actually is. I go to the website and am not presented with any information, just a prompt. So I ask it what it is and what it can do for me.

    My ask:

    > In a couple sentences, explain to me the product I'm being sold with ChatGPT. What does it do for me?

    The Reply from ChatGPT:

    > ChatGPT is a conversational AI that helps you think, create, learn, analyze, and get things done faster. You can use it to answer questions, draft and edit writing, summarize information, brainstorm ideas, learn new topics, write code, plan projects, and increasingly act as an assistant that can search for information, work with documents, generate images, and help complete tasks.

    > In simple terms: you're buying access to an AI that turns natural language into useful work—saving time, expanding your capabilities, and giving you an always-available collaborator for both everyday tasks and specialized knowledge work.

    This sounds much more like the former, a "solve all your problems" machine.... not a plausible-sounding text generation machine.

    Only two weeks ago Sam Altman said their new data center "could" be where cancer gets cured[0]. It is only the people who deeply understand AI who see it as a text generator of plausible-sounding text. That isn't what the marketing department, the CEO, or the product itself seem to be saying. I'm using OpenAI as the example here, but the others don't seem much different.

    [0] https://www.youtube.com/watch?v=9-tOtbDDrJA

    • In this hypothetical case of us being new users, you now know it's a conversational AI, so you continue asking:

      > Can I trust the output you give me?

      And I assume it explains what to trust vs. not.

      I think at the bottom you should also see something like "Any text can contain mistakes" or similar, which I know is a far cry from what some people push in the press with regard to capabilities. Still, I don't see the platforms themselves as lying about this, while I do see a bunch of people constantly over-hyping the possibilities.

  • The marketing materials are very much the former though. From claude.com:

    > If you can dream it, Claude can help you do it. Claude can process large amounts of information, brainstorm ideas, generate text and code, help you understand subjects, coach you through difficult situations, simplify your busywork so you can focus on what matters most, and so much more.

    What marketing copy have you read for LLMs that is like you mentioned?

    > But the way I'm being sold LLMs is basically "A text generator that gives you plausible-sounding human text that sometimes hallucinates and gets things wrong, based on your input"

I would be very willing to pay more! The choice between “you may get a correct answer, or you may get lied to, without a clear way to distinguish between the two” and “you may get a correct answer, or a clear indication that the answer was not found” is pretty clear. One is a much more useful tool than the other. I don’t see any real incentives for companies making LLMs to keep their AI factually unreliable. (Full disclosure: I work for one, but I’m definitely not in the rooms where such decisions would be made.)

'I don't know' is the correct answer for infinitely more questions than those that can be answered.