Comment by starchild3001

6 months ago

> This assessment fits with my anecdotal evidence. LLMs just cannot reason in any basic way.

LLM reasoning is brittle and unlike human cognition, but it is far from zero. It has demonstrably improved to the point where models can solve complex, multi-step problems across domains. See the numerous successful benchmarks and out-of-sample evals (livebench.ai, IMO 2025, trackingai.ai IQ tests, matharena.ai, etc.).

I personally gained multiple months of productivity from vibe coding in 2025. If correctly coding a complex piece of software from a vague, single-paragraph description isn't reasoning, what else is? Btw, I don't code UIs. I code complex mathematical algorithms, some of which aren't found in any textbook.

> LLMs have a large knowledge base that can be spit out at a moment notice. But they have zero insight on its contents, even when the information has just been asked a few lines before.

LLMs have excellent recall of recent information within their context window. While they lack human-like consciousness or "insight," their ability to synthesize and re-contextualize information from their vast knowledge base is a powerful capability that goes beyond simple data retrieval.

If anything, LLMs show polymath-level ability to synthesize information across domains. How do I know? I use them every day and get great mileage. It's very obvious.

> Most of the "intelligence" that LLMs show is just the ability to ask in the correct way the correct questions mirrored back to the user. That is why there is so many advice on how to do "proper prompting".

Prompting is the user interface for steering the model's intelligence. But the model's ability to generate complex, novel, and functional outputs that far exceed the complexity of the input prompt shows that its "intelligence" is more than just a reflection of the user's query.

To summarize: as a heavy daily user of SoTA LLMs for practically everything, I'm appalled by your statements. I suspect you don't use them enough, and lack a visceral feel for the scope of their capabilities.