Comment by Borealid
4 days ago
You're still using words like "recognize", which strongly suggests you haven't got the parent poster's point.
The model emits text. What it has emitted before is part of the input to the next text-generation pass. Since the training data don't usually include much text that says one thing and then follows it with "that was super stupid, actually it's this other way", the model is also unlikely to generate new tokens saying the previous ones were irrational.
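To make that loop concrete, here's a toy sketch in Python. The toy_model function is a made-up stand-in for the real network, and its fixed distribution is obviously fake; only the control flow around it matters:

    import random

    # Made-up stand-in for an LLM: given the text so far, return a probability
    # distribution over possible next characters. A real model is vastly more
    # complex, but the loop around it is the same.
    def toy_model(text_so_far):
        return {"a": 0.5, "b": 0.3, ".": 0.2}

    def generate(prompt, steps=20):
        text = prompt
        for _ in range(steps):
            dist = toy_model(text)                    # model sees everything emitted so far
            chars, weights = zip(*dist.items())
            text += random.choices(chars, weights)[0] # pick something statistically likely
        return text

    print(generate("The sky is "))

Note there is no step anywhere in that loop where a previous token gets judged true or false; the earlier output only matters because it has become part of the conditioning text.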
If you wanted to train a model to predict that the next sentence will be a contradiction of the previous one, you could do that. "True" and "correct" and "recognize" are not in the picture.
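A hedged sketch of what that could look like, with a throwaway bag-of-words perceptron over a made-up four-example dataset (no resemblance to a real system; the point is only that "contradiction" becomes an ordinary supervised label):

    from collections import defaultdict

    # 1 = the second sentence contradicts the first, 0 = it doesn't.
    toy_pairs = [
        ("the sky is blue", "the sky is not blue", 1),
        ("the sky is blue", "it is a clear day", 0),
        ("cats are mammals", "cats are not mammals", 1),
        ("cats are mammals", "cats have fur", 0),
    ]

    weights = defaultdict(float)

    def features(a, b):
        # crude word-pair features; a real model would encode the pair far better
        return [f"{wa}|{wb}" for wa in a.split() for wb in b.split()]

    def score(a, b):
        return sum(weights[f] for f in features(a, b))

    for _ in range(10):                    # a few perceptron epochs
        for a, b, label in toy_pairs:
            pred = 1 if score(a, b) > 0 else 0
            if pred != label:              # standard mistake-driven update
                for f in features(a, b):
                    weights[f] += label - pred

    print(score("the sky is blue", "the sky is not blue") > 0)  # prints True

All the model learns is "this looks like the kind of pair that was labeled a contradiction". The target is a statistical pattern in the labels, not the truth of either sentence.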
> LLMs can recognize errors in their own output. That's why thinking models generally perform much better than the non-thinking ones.
No, a block of text that begins "please improve on the following text:" is likely to continue after the included block with some text that sounds like a correction or refinement.
Nothing is "recognized", nor is anything "an error". Nothing is "thinking", any more than it would be if the LLM just printed whether the next letter were more likely to be a vowel or a consonant. Doing a better job of modeling text doesn't magically mean it's doing something that isn't a text-prediction function.
You're using the same words again. It looks like reasoning, but it's a simulation.
The LLM merchants are driving it, though, by applying pre-existing words to things that are not what those words say they are.
It's amazing what they can do, but an LLM cannot know whether what it outputs is true or correct, just that it's statistically likely.