Comment by Frieren
6 months ago
This assessment fits with my anecdotal evidence. LLMs just cannot reason in any basic way.
LLMs have a large knowledge base that can be spit out at a moment's notice. But they have zero insight into its contents, even when the information was given just a few lines earlier.
Most of the "intelligence" that LLMs show is just the user's own questions, correctly phrased, mirrored back at them. That is why there is so much advice on how to do "proper prompting".
That, and the fact that most questions have already been asked before, as anyone who spent some time on StackOverflow back in the day realized. Memory, not reasoning, is what is needed to answer them.
> This assessment fits with my anecdotal evidence. LLMs just cannot reason in any basic way.
LLM reasoning is brittle and not like human cognition, but it is far from zero. It has demonstrably improved to the point where it can solve complex, multi-step problems across domains. See the numerous successful benchmarks and out-of-sample evals (livebench.ai, IMO 2025, trackingai.ai IQ, matharena.ai, etc.).
I personally gained multiple months of productivity from vibe coding in 2025. If being able to correctly code a complex piece of software from a vague, single-paragraph description isn't reasoning, what else is? Btw, I don't code UIs. I code complex mathematical algorithms, some of which are never found in textbooks.
> LLMs have a large knowledge base that can be spit out at a moment's notice. But they have zero insight into its contents, even when the information was given just a few lines earlier.
LLMs have excellent recall of recent information within their context window. While they lack human-like consciousness or "insight," their ability to synthesize and re-contextualize information from their vast knowledge base is a powerful capability that goes beyond simple data retrieval.
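To be concrete about what "recall within the context window" means mechanically: chat models are stateless, and the full conversation history is re-sent with every request, so anything inside the window is available to the model verbatim. A minimal sketch, assuming the OpenAI Python SDK (the model name and conversation are illustrative, not from this thread):

```python
# Minimal sketch of context-window recall: the model "remembers" earlier
# turns only because the whole history is included in each request.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

history = [
    {"role": "user", "content": "My project codename is Bluebird."},
    {"role": "assistant", "content": "Got it: the codename is Bluebird."},
    # This follow-up works only because the turns above are
    # re-sent as part of this same request.
    {"role": "user", "content": "What was the codename again?"},
]

response = client.chat.completions.create(
    model="gpt-4o",  # any chat model; the name here is an assumption
    messages=history,
)
print(response.choices[0].message.content)  # expected to mention "Bluebird"
```

Drop the earlier turns from `history` and the recall disappears, which is exactly the in-window recall being described.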
If anything, LLMs show polymath-level ability to synthesize information across domains. How do I know? I use them every day and get great mileage. It's very obvious.
> Most of the "intelligence" that LLMs show is just the user's own questions, correctly phrased, mirrored back at them. That is why there is so much advice on how to do "proper prompting".
Prompting is the user interface for steering the model's intelligence. However, the model's ability to generate complex, novel, and functional outputs that far exceed the complexity of the input prompt shows that its "intelligence" is more than just a reflection of the user's query.
To summarize, I'm appalled by your statements, as a heavy daily user of SoTA LLMs for practically anything. I suspect you don't use them enough, and lack a visceral feel for the scope of their capabilities.
Please don't tell me you were one of those people marking every SO question as a duplicate, more often than not missing the entire nuance that made it not a duplicate at all, while the answers to the so-called previously asked question were utterly unusable.
This was one of those infuriating things that drove so many away from SO; they jumped ship the second there was an alternative.
I'm not sure why duplicates were ever considered an issue. For certain subjects (like JS), things evolved so quickly during the height of SO that even a year-old answer was outdated.
That, and search engines seemed to promote more recent content, so an old answer sank under an ocean of blog spam.
SO wanted to avoid being a raw Q&A site in favor of something more like a wiki.
If a year-old answer on a canonical question is now incorrect, you edit it.
I was "playing" the gamification part of StackOverflow. I wanted to ask a good question for points. But it was very difficult because any meaningful question had already been asked. It was way easier to find questions to answer.
Every time I ask people for an example of this, and get one, I agree with the duplicate determination. Sometimes it requires a little skimming of the canonical answers past just the #1 accepted one; sometimes there's a heavily upvoted clarification in a top comment, but it's usually pretty reasonable.
> This assessment fits with my anecdotal evidence. LLMs just cannot reason in any basic way.
Agreed completely, and the sentiment seems to be spreading at an ever-increasing rate. I wonder how long it will be before the bubble collapses. I was thinking maybe as long as a few years, but it might be far sooner at this rate. All it will take is one of the large AI companies publicly stating that they're no longer making meaningful gains, or some other revelation that shows the public what's really going on behind the curtain.
I'm certain the AI hype bubble will be studied for generations as the greatest mass delusion in history (so far).