Comment by ninetyninenine

2 months ago

>This is because the transformers are not able to take in or output actual text efficiently. Instead, the text is converted into numerical representations of itself, which is then contextualized to help the AI come up with a logical response. In other words, the AI might know that the tokens “straw” and “berry” make up “strawberry,” but it may not understand that “strawberry” is composed of the letters “s,” “t,” “r,” “a,” “w,” “b,” “e,” “r,” “r,” and “y,” in that specific order. Thus, it cannot tell you how many letters — let alone how many “r”s — appear in the word “strawberry.”

This is a great example. The LLM doesn't know something, so it makes up something in its place. But just because it made something up doesn't mean it's incapable of reasoning.

The thing with LLMs is that they can reason; there's evidence for that. But they can also be creative. And at a low level the line between reasoning and creativity is a blur, because reasoning is a form of inference and so is creativity. So whether an LLM reasons, gets creative, or hallucinates, it's ultimately doing the same type of thing: inference.

We have mechanisms in our brains that let us tell the difference most of the time. The LLM does not. That's the fundamental line, and because of it I feel we are really close to AGI. A lot of people argue the opposite: they think reasoning is core to intelligence, that it's a separate concept from creativity, and that LLMs lack reasoning entirely. I disagree.

In fact, we humans ourselves have trouble separating hallucination from reasoning. Look at religion: it permeates our culture, but it's basically hallucination that we mistake for reasoning. Right? Ask any Christian or Muslim, and the religion makes rational sense to them! They can't tell the difference.

So the key is to give the LLM the ability to know the difference.

Is there some way to build into the transformer a way to quantify whether something is fact or fiction? Say the answer to a prompt produces an inferred data point that sits very far from any cluster of training data. Could we derive from that a metric quantifying how well the response is supported by the evidence?

Right? The whole thing lives on a big multidimensional mathematical curve. If the inferred point on the curve is right next to existing data, it's more likely to be true. If it's far away in some nether region of the curve, it's more likely to be false.
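A rough sketch of what that metric could look like, with everything here being my own assumption (the embedding model, the toy "training data," and the idea of using k-means centroids as the clusters):

```python
# Hypothetical sketch: score a generated answer by how far its embedding
# sits from clusters of training-data embeddings. Small distance = close
# to known data (crudely, better supported); large distance = off in a
# sparsely covered region (crudely, more likely confabulated).
import numpy as np
from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Tiny stand-in for "the training data" -- in reality this would be a
# large sample of the corpus, embedded offline.
training_texts = [
    "Strawberries are red fruits with seeds on the outside.",
    "The word strawberry contains three r's.",
    "Paris is the capital of France.",
]
train_embs = embedder.encode(training_texts, normalize_embeddings=True)

# Cluster the training embeddings; the centroids approximate the regions
# of the space that are well covered by data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(train_embs)

def support_score(answer: str) -> float:
    """Distance from the answer's embedding to the nearest training
    cluster centroid. Smaller = closer to existing data."""
    emb = embedder.encode([answer], normalize_embeddings=True)
    dists = np.linalg.norm(kmeans.cluster_centers_ - emb, axis=1)
    return float(dists.min())

print(support_score("Strawberry has three r's."))          # near a cluster
print(support_score("The moon is made of green cheese."))  # farther away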

If the LLM can be made more self-aware, and we can build this quantitative metric into the network and then use reinforcement learning to make it less sure about an answer that sits far from any cluster of training data points, we could likely improve the hallucination problem a lot.
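One very simplistic way to wire that into RL would be plain reward shaping: subtract a penalty that grows with the distance metric. A minimal sketch, assuming the `support_score` function from the previous snippet and a made-up base reward and weight:

```python
# Hypothetical reward shaping: answers whose embeddings sit far from the
# training clusters get their reward dragged down, so fine-tuning nudges
# the model to hedge or abstain when it is off in a "nether region."
def shaped_reward(answer: str, base_reward: float, lam: float = 0.5) -> float:
    distance = support_score(answer)      # far from data => large distance
    return base_reward - lam * distance   # distance acts as a penalty

# e.g. an answer the reward model liked (base_reward=1.0) but which is far
# from any training cluster ends up with a lower shaped reward:
print(shaped_reward("The moon is made of green cheese.", base_reward=1.0))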

Of course, I'm sure this is a blunt instrument, since even false inferences can sit very close to existing training data. But at least it gives the LLM some level of self-awareness about how reliable its own answer is.