Comment by FrojoS
6 hours ago
> there's no reason to believe the progress of LLMs [...] will stop anytime soon
Wrong. Every advancement has followed a s curve. Where we are on that curve is anyones guess. Or maybe "this time its different".
6 hours ago
> there's no reason to believe the progress of LLMs [...] will stop anytime soon
Wrong. Every advancement has followed a s curve. Where we are on that curve is anyones guess. Or maybe "this time its different".
Great. You see a shape in graphs. And that shape tells you that _at some unknown point in the future_ progress will slow (but likely not stop).
Now back to the point, what reason do you have to believe progress will stop soon? If you have no reason, then it sounds like you agree with OP.
Which makes the patronizing sarcasm all that much more nauseating.
Nausea aside, what evidence does anyone have that “super intelligence” of the sort your argument alludes to is even possible? Because that’s what we’re really talking about; greater than human intelligence on this sort of academic task. For example; When llms start contributing meaningfully to their own development, that would be a convincing indicator imo.
This discussion is not about superintelligence, it is about continued progress. Fully general human intelligence at much lower cost than humans is all that is required to profoundly reshape society, but it is not clear even that will happen soon.
As the blog points out - this is one particular subfield where LLMs have much easier prospects - lots of low hanging fruit that “just” requires a couple weeks of PHD candidate research.
Mathematics itself is one of a small handful of endeavors where automated reinforcement training is extremely straightforward and can be done at massive scale without humans.
Neither of these factors place a structural bound on the kind of thing LLMs can be good at, but we are far from certain we can achieve performance at this level in other fields economically and in the near future.
> When llms start contributing meaningfully to their own development, that would be a convincing indicator imo.
This has been the case for awhile now already…
https://kersai.com/the-48-hours-that-changed-ai-forever-clau...
2 replies →
It’s more of a guess if you don’t know about things like scaling laws and RL with verification. The onus of “we’re going to saturate” anytime soon is on that claim because every measurement points to that not being true.
Yeah. People (Gary Marcus) have been claiming that AI will hit a wall or is hitting a wall or already has hit a wall since 2023, basically. And yet every time they proclaim that the AI industry found new ways of training their AI's, new ways of integrating them with external tools and feedback loops, new architectures and more to keep the exponential growing. And sure enough if you look at literally every attempt to objectively rate and verify the capability of these models, including things like the METR time horizon autonomy index or the artificial analysis intelligence index, you see exponential or even greater than exponential growth, continuing smoothly through each of the points people claimed that it would begin to slow down, with no sinus slowing down or stopping at all. So yeah, I think at some point the onus has to lie on the ones that are making the claim that keeps being wrong and the continues to be wrong and it completely goes against the current tangent of the curve that we're seeing in all objective metrics. Especially when they can't give specific new reasons for progress to stop beyond the ones they gave last time. It didn't stop and really can't give specific reasons at all besides vague general points about stochastic parrots and S curves.
I really have to highlight the S-curve nonsense because, like, yes, I think this technology's improvement will follow an S-curve. It's absurd to think that it will just follow an exponential up towards infinity forever because nothing in the world really works like that. However, like everyone else in this thread is saying, we have no idea where on the S-curve we actually are, and it's impossible to know until it's already slowed down. So really all appeals to the S curve do are as function as a sort of non-specific, unfalsifiable prophecy that someday it will slow down, which doesn't really tell us anything useful, and also frees the person referencing the S curve from ever actually having to worry about being wrong. Just like the Singularity people, the slowdown of the S curve is always near. This is actually a known and well-established tactic of religions and other people that want to make prophecies without having to worry about turning out to be wrong — unfalseifiable vague prophecies with no actual timeline, and thus no clear import to the present so that they can never be shown to be wrong.
There are advancements that do not follow s curves - consider for instance total data transmitted over all networks, or financial derivatives volumes.
I think a better question for AI is “is it more like a network effect, liquidity effect, or a biological/physical effect”?
Those are measuring the utility of a technological advancement by looking at usage, not the pace of advancement of said technology.
Yes. But quantity has a quality all its own, as they say — derivatives have gone through at least a few step functions where they have become more important and more useful as their usage grows. I’d call that advancement.
Maybe just to be clear I think that kneejerk “I hate this AI trend, and prefer to believe this will end soon, all exponential growth ends eventually” is intellectually lazy, and dangerous for younger engineers/hackers, a group I hope can benefit from being on HN.
Bitcoin mining went through something like 13 10x growth periods, last I ran the numbers a few years ago. There are physical processes that do have very extended periods of doubling, and there are digital and financial processes that don’t show any signs of doing anything but continuing to keep growing over their multidecade lives. So, like I said, it’s worth thinking carefully, and risk mitigation for things like mental health, career decisions and investment decisions indicates we should be cautious assessing new dynamics.
Got him. That guy always posts with so much bluster lmao.
>There are advancements that do not follow s curves - consider for instance total data transmitted over all networks, or financial derivatives volumes
Or Roman trade volume before the Fall of Rome.
Not to mention what you describe is not technological improvement but increase in data or money flows, not the same.
Sic transit gloria - obviously.
But I don’t that think it’s quite so obvious that model quality / growth / usefulness is definitively and obviously not more like data or money flows than it is like some other process.
Total volume of usage is not an advancement, it’s orthogonal.
Indeed, and it's more linked with market penetration than technological advancement. It's like evaluating airplane technology by "total miles flown".
[dead]
What people miss is that AI isn't one S curve, each capability we try to bake into a model has its own S curve. Model progress might not impact some capabilities at all, but other capabilities might get totally overhauled.
This could be right for the current architecture of LLMs, but you can come up with specialized large language models that can more efficiently use tokens for a specific subset of problems by encoding the information differently (https://www.nature.com/articles/d41586-024-03214-7).
So if instead of text we come up with a different representation for mathematical or physical problems, that could both improve the quality of the output while reducing the amount of transformers needed for decoding and encoding IO and for internal reasoning.
There are also difference inference methods, like autoregressive and diffusion, and maybe others we haven't discovered yet.
You combine those variables, along with the internal disposition of layers, parameter size and the actual dataset, and you have such a large search space for different models that no one can reliably tell if LLM performance is going to flatline or continue to improve exponentially.
> So if instead of text we come up with a different representation for mathematical or physical problems, that could both improve
But then, wouldn't we first have to translate all of our current math and physics knowledge into that new representation in order to be able to train a model on it? Looks like a tremendous amount of work to me.
Yes, but by then you already have general LLMs capable of helping with the work. And even if you didn't, if that's what it would take to advance research in these fields, that would be a justifiable effort.
>This could be right for the current architecture of LLMs, but you can come up with specialized large language models that can more efficiently use tokens for a specific subset of problems by encoding the information differently.
That's precisely what happens on the bad side of a S curve.
Progress don't stop however, and the S curve resets, because then you are optimizing a new architecture.
He said "will stop anytime soon". He didn't say forever.
Which still makes no sense. There is the same chance we are flatlining now as that we are flatlining in e.g. 3 years or 5 years.
In what sense are the models flatlining?
6 replies →
It can be S curve (and it almost surely is), but on every chart you can plot, you don't see even of an inkling of the bend yet.
Software and hardware have no limits. Theoretically would could bozons for computations and have the same amount of computation available on one cm3 of the current total computation in the entire world. Same with software. Never there was a stop on new algorithms. With LLMs there are so many parts that will get better and are not very far fetched.
What the fuck does that have to do with “soon”?
This is FUD and extremely wrong. None of the advancements have followed an S curve. This time IS different and it should be obvious to you at this point.