Comment by willvarfar
8 hours ago
Doing analysis on small models or small data is perfectly valid if the results extrapolate to large models. Which is why, right now, we're seeing new research papers that still use the same small datasets and compare against the same small models that papers five years ago did.
I have nothing against researching this, I think it's important. My main issue is with articles choosing to grab a "conclusion" and imply it extrapolates to larger models, without any support for that. They are going for the catchy title first, fine-print be damned.
I was just at the KDD conference, and the general consensus agreed with this paper. There was only one keynoter who simply assumed that LLMs can reason, which was jarring, as the previous keynoter had just explained at length why we need a neuro-symbolic approach instead.
The thing is, I think the current companies making LLMs are _not_ trying to be correct or right. They are just trying to hide the wrong answers better. In the business future of AI, the coding stuff we focus on here on HN - how AI can help or impact us - is just a sideline.
The huge-money business future of LLMs is end consumers, not creators, and it is product and opinion placement. Their path to that is friendship: they want their assistant to be your friend, then your best friend, then your only friend, then your lover. If the last 15 years of social media have been about discord and polarisation to drive engagement, the next 15 will be about friendship and love, even though that leads to isolation.
None of this needs the model to grow strong reasoning skills. That's not where the real money is. And CoT - whilst super great - is just as effective if it's merely hiding better that it's giving you the wrong answer (by being more internally consistent) as it is if it's giving you a better answer.
> None of this needs the model to grow strong reasoning skills. That's not where the real money is.
I never thought about it like that, but it sounds plausible.
However, I feel like getting to that stage is even harder to get right than reasoning is?
Aside from the <0.1% of severely mentally unwell people who already imagine themselves to be in relationships with AIs, I don't think a lot of normal people will form lingering attachments to them without solving the issue of permanence and memory.
They're currently essentially stateless; while that's surely enough for short-term attachment, I don't see this becoming a bigger issue because of that glaring shortfall.
It'd be like being in a relationship with a person with dementia, and that's not a happy state of being.
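To make the statelessness concrete: the model itself remembers nothing between calls, so any appearance of memory has to be faked client-side by resending the history (or a summary of it) on every turn. A minimal sketch, where call_llm is a hypothetical stub standing in for whatever chat API you use:

    # Minimal sketch of client-side "memory" for a stateless chat model.
    # call_llm is a hypothetical stub; a real client would hit an actual
    # chat-completions endpoint. The point is only that the model itself
    # keeps no state between calls.

    def call_llm(messages: list[dict]) -> str:
        """Stand-in for a stateless chat API: it only sees what we pass in."""
        return f"(reply based on {len(messages)} resent messages of context)"

    class FakeMemoryChat:
        def __init__(self) -> None:
            self.history: list[dict] = []   # the *client* holds all the state

        def send(self, user_text: str) -> str:
            self.history.append({"role": "user", "content": user_text})
            reply = call_llm(self.history)  # full history resent on every turn
            self.history.append({"role": "assistant", "content": reply})
            return reply

    chat = FakeMemoryChat()
    chat.send("Remember that my dog is called Rex.")
    print(chat.send("What's my dog called?"))
    # Lose self.history (or overflow the context window) and "Rex" is gone:
    # the model never stored anything itself.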
Honestly, I think this trend is severely overstated until LLMs can sufficiently emulate memories and shared experiences. And that's still fundamentally impossible, just like "real" reasoning with understanding.
So I disagree after thinking about it more - emulated reasoning will likely have a bigger revenue stream via B2E applications compared to emotional attachment in B2C...
> None of this needs the model to grow strong reasoning skills. That's not where the real money is
"And the world is more and more complex, and the administrations are less and less prepared"
(~~ Henry Kissinger)
"as the previous keynoter had just explained at length why we need a neuro-symbolic approach instead"
Do you have a link to the video for that talk?
1 reply →
As to the general consensus: Hinton gave a recent talk, and he seemed adamant that neural networks (which LLMs are) really are doing reasoning. He gives his reasons for it. Is Hinton considered an outlier, or?
2 replies →
Not sure what all this is about; I somewhat regret taking a break from coding with LLMs to have it explained to me that it's all a mirage and a secret and sloppy plan for getting me an automagic egirl or something. ;)
3 replies →
Because model size is a trivial parameter, and not a new paradigm.
What you're saying is like claiming you can't extrapolate that long division works on 100-digit numbers because you've only worked through it on 7-digit numbers and a few small polynomials.
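For what it's worth, that analogy is easy to make concrete: schoolbook long division is defined digit by digit, so the same procedure handles 7 digits or 100 without change. A throwaway sketch (my own toy code, nothing to do with the paper):

    # Schoolbook long division, digit by digit, on a dividend given as a string.
    # The loop never looks at how many digits there are, which is the point:
    # the procedure extrapolates to any length by construction.

    def long_division(dividend: str, divisor: int) -> tuple[str, int]:
        quotient_digits = []
        remainder = 0
        for digit in dividend:
            remainder = remainder * 10 + int(digit)           # bring down the next digit
            quotient_digits.append(str(remainder // divisor))
            remainder %= divisor
        quotient = "".join(quotient_digits).lstrip("0") or "0"
        return quotient, remainder

    print(long_division("9876543", 7))   # 7 digits:   ('1410934', 5)
    print(long_division("1" * 100, 7))   # 100 digits: works identically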
Scale changes the performance of LLMs.
Sometimes, we go so far as to say there is "emergence" of qualitative differences. But really, this is not necessary (and not proven to actually occur).
What is true is that the performance of LLMs at OOD tasks changes with scale.
So no, it's not the same as solving a math problem.
Alas, not true. It would be easier to predict progress if so.
This is 100% how it doesn't work with LLMs.
2 replies →
Aren't most major LLMs moving to an architecture where the model is made up of tons of smaller models?
There's a mountain of reasons why this makes sense from a cost perspective, and seemingly for quality too, as the newer models train substantially more cheaply and still outperform the older ones.
Naively, this seems like it would be relevant.
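Assuming that's a reference to mixture-of-experts models (my reading of "tons of smaller models"): the idea is that a learned router sends each token to only a few small expert sub-networks, so most of the parameters sit idle on any given forward pass. A rough sketch of top-k routing, not any particular lab's implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyMoELayer(nn.Module):
        """Toy top-k mixture-of-experts layer: each token is processed by only
        k of num_experts small MLPs, chosen per token by a learned router."""

        def __init__(self, d_model: int = 64, num_experts: int = 8, k: int = 2):
            super().__init__()
            self.k = k
            self.router = nn.Linear(d_model, num_experts)      # gating scores
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, 4 * d_model),
                              nn.GELU(),
                              nn.Linear(4 * d_model, d_model))
                for _ in range(num_experts)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (tokens, d_model)
            scores = self.router(x)                            # (tokens, num_experts)
            topk_scores, topk_idx = scores.topk(self.k, dim=-1)
            weights = F.softmax(topk_scores, dim=-1)           # mix the k expert outputs
            out = torch.zeros_like(x)
            for slot in range(self.k):
                for e, expert in enumerate(self.experts):
                    mask = topk_idx[:, slot] == e              # tokens routed to expert e
                    if mask.any():
                        out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
            return out

    layer = TinyMoELayer()
    tokens = torch.randn(10, 64)
    print(layer(tokens).shape)   # torch.Size([10, 64]); only ~2 of 8 experts ran per token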
The extrapolation doesn't work if the transformer is too shallow (too few layers) relative to sequence length; see https://arxiv.org/abs/2503.03961. A bunch of tasks become infeasible when the layer count is too low, and 4 layers is way too low. I.e., linearly increasing the number of layers in a model can result in a superlinear increase in performance on tasks like reasoning.
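To give a flavour of the kind of task that hits this depth wall (my own toy framing, not the benchmark from the paper): k-hop composition, where the answer requires chaining k lookups, and a fixed-depth model can only resolve so many hops in a single forward pass.

    import random

    def make_khop_example(k: int, n_symbols: int = 26, seed: int = 0):
        """Build a k-hop composition problem: a random symbol-to-symbol mapping
        plus a query that can only be answered by following the mapping k times.
        (A toy illustration of depth-limited multi-step tasks, not the actual
        benchmark from the paper.)"""
        rng = random.Random(seed)
        symbols = [chr(ord("a") + i) for i in range(n_symbols)]
        mapping = dict(zip(symbols, rng.sample(symbols, n_symbols)))
        start = rng.choice(symbols)

        answer = start
        for _ in range(k):               # ground truth: chase the chain k times
            answer = mapping[answer]

        facts = ", ".join(f"{a} -> {b}" for a, b in mapping.items())
        prompt = (f"Mapping: {facts}. Starting from '{start}', "
                  f"apply the mapping {k} times. Which symbol do you reach?")
        return prompt, answer

    prompt, answer = make_khop_example(k=8)
    print(prompt)
    print("expected:", answer)
    # Each extra hop adds another sequential dependency; a fixed-depth model
    # can only resolve so many of them in one pass without writing out
    # intermediate steps (i.e. chain of thought).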