Comment by tavavex
10 days ago
I think you're confusing OP with the people who claim that there is zero functional difference between an LLM and a search engine that just parrots what's already in it. But OP never made such a claim. Here, let me try: the simplest explanation for how next-token estimation leads to a model that often produces true answers is that, for most inputs, the most likely next token is true. Given their size and the way they're trained, LLMs obviously don't just ingest training data like a big archive; they contain something like an abstract representation of tokens and concepts. While not exactly like human knowledge, the network is large and deep enough that LLMs are capable of predicting true statements based on the preceding text. This also lets them answer questions that aren't in their training dataset, although accuracy obviously suffers the further you deviate from well-known topics. The most likely next token following a question is usually the true answer, so they essentially ended up being trained to estimate truth.
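For concreteness, here's a minimal sketch of the objective being described. This is my own illustration in PyTorch, not any particular model's code, and `model` is a hypothetical stand-in for any autoregressive network: the loss only rewards assigning high probability to the token that actually comes next.

```python
# Minimal sketch (illustration only) of the next-token training objective.
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens: torch.LongTensor) -> torch.Tensor:
    """tokens: (batch, seq_len) integer token ids."""
    inputs, targets = tokens[:, :-1], tokens[:, 1:]   # shift by one position
    logits = model(inputs)                            # (batch, seq_len-1, vocab)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),          # every position is a prediction
        targets.reshape(-1),                          # ...of the true next token
    )
```

Nothing in that loss mentions truth; it just happens that, for a lot of training text, the most probable continuation is the true one.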
I'm not saying this is bad or underwhelming, by the way. It's incredible how far people have been able to push machine learning with just the knowledge we have now, and how they're still making progress. I'm just saying it's not magic. It's not something like an unsolved problem in mathematics.
No one ever made the claim that it was magic, not even remotely. Regarding the rest of your commentary:

a) The original claim was that LLMs are not understood and are a black box.
b) Someone then claims that this is not true and that they know well how LLMs work: it is simply due to questions and answers being in close textual proximity in the training data.
c) I then claim this is a shallow explanation, because you additionally need to invoke a huge abstraction network, which is itself a black box.
d) You seem to agree with this while at the same time saying I misrepresented "b", which I don't think I did. They really claimed they understood it, and only offered this textual-proximity thing.
In general, every attempted explanation of LLMs that appeals to "[just] predicting the next token" is thought-terminating and automatically invalid as an explanation. Why? Because it confuses the objective function with the result. It adds exactly zero over saying "I know how a chess engine works, it just predicts the next move and has been trained to predict the next move" or "a talking human just predicts the next word, as it was trained to do". It says nothing about how this is done internally in the model. You could have a physical black box predicting the next token, and inside it you could have simple frequentist tables, or a human brain, or an LLM. In all cases you could say the box is predicting the next token, and if any training was involved, you could say it was trained to predict the next token.
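To make the "same interface, different internals" point concrete, here's a tiny hypothetical sketch (none of this is real library code, and the class names are my own): both objects answer to the description "predicts the next token", which is exactly why that description on its own explains nothing about what's inside the box.

```python
# Hypothetical illustration: two very different "boxes" that both satisfy
# the same description, "predicts the next token".
from collections import Counter, defaultdict

class BigramTable:
    """A simple frequentist table: which token most often follows which."""
    def __init__(self, corpus):
        self.counts = defaultdict(Counter)
        for a, b in zip(corpus, corpus[1:]):
            self.counts[a][b] += 1

    def next_token(self, context):
        followers = self.counts[context[-1]]
        return followers.most_common(1)[0][0] if followers else None

class NeuralLM:
    """Stand-in for a large network; the internals stay a black box here."""
    def next_token(self, context):
        ...  # attention layers, billions of parameters, etc.

# Both "predict the next token", and both could be "trained to predict the
# next token"; the phrase alone doesn't distinguish a lookup table from a deep net.
print(BigramTable("the cat sat on the mat".split()).next_token(["the"]))  # -> "cat" or "mat"
```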