Comment by Davidzheng
6 days ago
Sorry, but to say current LLMs are a "dead end" is kind of insane if you compare them with the previous records in general AI before LLMs. The earlier language models would be happy to be SOTA on 5 random benchmarks (like sentiment analysis or some types of multiple-choice questions), and SOTA otherwise consisted of some AIs that could play like 50 Atari games. And out of nowhere we have AI models that can do tasks that are not in the training set, pass Turing tests, tell jokes, and work out of the box on robots. It's a literally insane level of progress, and even if current techniques don't get to full human level, it will not have been a dead end in any sense.
Something can be much better than what came before and still be a dead end. Literally, a dead-end road can take you closer but never get you there.
But a dead end to what? All progress eventually plateaus somewhere. It's clearly insanely useful in practice. And do you think there will be any future AGI whose development is not helped by current LLM technology? Even if the architecture is completely different, the ability of LLMs to understand human data automatically is unparalleled.
To reaching AI that can reason. And sure, as I wrote, large language models might become a relevant component for processing natural language inputs and outputs, but I do not see a path towards large language models becoming able to reason without some fundamentally new ideas. At the moment we try to paper over this deficit by giving large language models access to all kinds of external tools like search engines, compilers, theorem provers, and so on.
You're in a bubble. Anyone who is responsible for making decisions and not just generating text for a living has more trouble seeing what is "insanely useful" about language models.
> the ability of LLMs to understand
But it doesn't understand. It's just similarity and next-likely-token search. The trick is that this turns out to be useful or pleasing when tuned well enough.
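For what it's worth, here's a rough sketch of what "next likely token" means mechanically, using GPT-2 via the Hugging Face transformers library (the model and prompt are just illustrative):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load a small causal language model (illustrative choice).
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    prompt = "The capital of France is"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids

    with torch.no_grad():
        # logits has shape (batch, sequence_length, vocab_size)
        logits = model(input_ids).logits

    # Greedy decoding: pick the single most probable next token.
    next_token_id = torch.argmax(logits[0, -1]).item()
    print(tokenizer.decode(next_token_id))  # most likely " Paris"

Everything the model "says" is produced by repeating that last step, one token at a time.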
I think large language models have essentially zero reasoning capacity. Train a large language model without exposing it to some topic, say mathematics, during training. Now expose the model to mathematics: feed it basic schoolbooks, explanations, and exercises, just like a teacher would teach mathematics to children in school. I think the model would not be able to learn mathematics this way to any meaningful extent.
The current generation of LLMs has very limited ability to learn new skills at inference time. I disagree that this means they cannot reason. I think reasoning is by and large a skill which can be taught at training time.
Do you have an example of some reasoning ability that any of the large language models has learned? Or do you just mean that you think we could train them in principle?