
Comment by doubleunplussed

4 days ago

I'm surprised not to see much pushback on your point here, so I'll provide my own.

We have an existence proof for intelligence that can improve AI: humans can do this right now.

Do you think AI can't reach human-level intelligence? We have an existence proof of human-level intelligence: humans. If you think AI will reach human-level intelligence then recursive self-improvement naturally follows. How could it not?

Or do you think human-level intelligence is some kind of natural maximum? Why? That would be strange, no? Even if you think it's a natural maximum for LLMs specifically, why? And why do you think we wouldn't modify architectures as needed to continue making progress? That's already happening: our LLMs are a long way from the pure text-prediction engines of four or five years ago.

There is already a degree of recursive improvement going on right now, but with humans still in the loop. AI researchers currently use AI in their jobs, and despite the recent study suggesting AI coding tools don't improve productivity in the circumstances they tested, I suspect AI researchers' productivity is indeed increased through use of these tools.

So we're already on the exponential recursive-improvement curve, it's just that it's not exclusively "self" improvement until humans are no longer a necessary part of the loop.

On your specific points:

> 1. What if increasing intelligence has diminishing returns, making recursive improvement slow?

Sure. But this is a point of active debate between "fast take-off" and "slow take-off" scenarios; it's certainly not settled among rationalists which is more plausible, and it's a straw man to suggest they all believe in a fast take-off. Both fast and slow take-off driven by recursive self-improvement are still recursive self-improvement, so if you only want to criticise the fast take-off view, you should say so more precisely.

I find both slow and fast take-off plausible, as the world has seen both periods of fast economic growth through technology, and slower economic growth. It really depends on the details, which brings us to:

> 2. LLMs already seem to have hit a wall of diminishing returns

This is IMHO false in any meaningful sense. Yes, we have to use more computing power to get improvements without doing any other work. But have you seen METR's metric [1] on AI progress, in terms of the (human) duration of tasks models can complete? That is an exponential curve that has not yet bent, and if anything has accelerated slightly.

Do not confuse GPT-5 (or any other incrementally improved model) failing to live up to unreasonable hype with an actual slowing of progress. AI capabilities are continuing to increase; being on an exponential curve often feels unimpressive at any given moment, because the relative rate of progress isn't increasing. That is a fact about our psychology. If we look at actual metrics (ones without a natural cap, unlike evals that max out at 100% and so are poor for measuring long-run progress), we see steady exponential progress.
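
As a rough illustration of what an unbent exponential in that metric implies, here's a back-of-the-envelope sketch. The roughly seven-month doubling time is the figure METR reported; the one-hour starting horizon is an assumption for illustration, and the exact numbers aren't the point:

    # Back-of-the-envelope extrapolation of a METR-style "task time horizon".
    # The ~7-month doubling time and 1-hour starting horizon are assumptions
    # for illustration, not METR's exact published fit.
    from math import log2

    doubling_months = 7.0   # assumed doubling time of the 50%-success time horizon
    horizon_hours = 1.0     # assumed horizon "today"

    def horizon_after(months):
        """Task horizon after `months`, if the curve keeps doubling at the same rate."""
        return horizon_hours * 2 ** (months / doubling_months)

    def months_until(target_hours):
        """Months until the horizon reaches `target_hours`, under the same assumption."""
        return doubling_months * log2(target_hours / horizon_hours)

    for years in (1, 2, 4):
        print(f"after {years} yr: tasks of ~{horizon_after(12 * years):.0f} h")
    print(f"~1 work-month tasks (170 h): ~{months_until(170):.0f} months away")

Whether the curve actually keeps doubling is of course exactly what's in dispute; the sketch only shows what "steady exponential progress" cashes out to if it does.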

> 3. What if there are several paths to different kinds of intelligence with their own local maxima, in which the AI can easily get stuck after optimizing itself into the wrong type of intelligence?

This seems valid. But unless we see METR's curve bend soon, it seems to me we should not count on this. LLMs have specific flaws, but I think if we are honest with ourselves and not over-weighting the specific silly mistakes they still make, they are on a path toward human-level intelligence in the coming years.

I realise that claim will sound ridiculous to some, but I think that is in large part because people instinctively internalise that whatever LLMs can already do is not that impressive (it's incredible how quickly expectations adapt), and therefore over-index on their remaining weaknesses, even though those weaknesses are also improving over time. If you showed GPT-5 to someone from 2015, they would tell you this thing is near human intelligence, or even more intelligent than the average human. I think we all agree that's not true, but superficially people would believe it if their expectations weren't constantly adapting to the state of the art.

> 4. Once AI realizes it can edit itself to be more intelligent, it can also edit its own goals. Why wouldn't it wirehead itself?

It might, but would it? I have no idea. Would you wirehead yourself if you could? Many humans do something like this (drug use, short-form video addiction), and I expect AI to have similar issues (one reason it's dangerous), but most of us don't feel this is an adequate replacement for "actually" satisfying our goals, and wouldn't feel inclined to modify our own goals to make it so even if we could.

> Knowing Yudowsky I'm sure there's a long blog post somewhere where all of these are addressed with several million rambling words of theory

Uncalled for, I think. There are valid arguments against you, and you're pre-emptively dismissing responses by vaguely criticising their length. This comment is longer than yours, and I reject any implication that that weakens it.

Your criticisms are three "what ifs" and an (IMHO) falsehood; I don't think you're doing much better than "millions of words of theory without evidence". To the extent that Yudkowsky and co really did theorise without evidence, I think they deserve credit, as that theorising predated the current AI ramp-up, at a time when most would have thought AI anything like what we have now was a distant pipe dream. To the extent that the theorising continues in the present, it's not without evidence; I point you again to METR's unbending exponential curve.

Anyway, I contend your points comprise three "what ifs" and (IMHO) a falsehood. Unless you think "AI can't recursively self-improve" already has strong priors in its favour, such that strong arguments are needed to shift that view (and I don't think that's the case at all), this is weak. You would need to argue why strong evidence should be required to overturn a default "AI can't recursively self-improve" position, when a) we are already seeing recursive improvement (just not purely "self"-improvement), and b) it's very normal for technological advancement to have recursive gains; see e.g. Moore's law, or technological contributions to GDP growth generally.

Far from a damning example of rationalists thinking sloppily, this particular point seems like one that shows sloppy thinking on the part of the critics.

It's at least debatable, which is all it has to be for calling it "the biggest nonsense axiom" to be a poor point.

[1] https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...

Yudkowsky seems to believe in fast take-off, so much so that he suggested bombing data centers. To more directly address your point, I think it's almost certain that increasing intelligence has diminishing returns and that the recursive self-improvement loop will be slow. The reason is that collecting data is absolutely necessary, and many natural processes are both slow and chaotic, meaning that learning from observing and manipulating them will take years at least, and lots of resources.

Regarding LLMs, I think METR's is a decent metric. However, you have to consider the cost of achieving each additional hour or day of task horizon. I'm open to correction here, but I would bet that the cost curves are growing faster than the improvement curves. That would be fundamentally unsustainable and would point to a limitation of LLM training/architecture for reasoning and world modeling.
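
To make the cost point concrete, what matters is the ratio of the two growth rates. Here's a toy sketch; both doubling times are made-up assumptions (I'm not aware of a clean published cost-per-horizon curve), so treat it as illustrating the shape of the argument rather than real numbers:

    # Toy comparison of two exponentials: capability (task horizon) vs. the cost
    # of reaching each new horizon. Both doubling times are illustrative assumptions.
    capability_doubling_months = 7.0   # assumed, METR-style horizon metric
    cost_doubling_months = 5.0         # assumed "costs grow faster" scenario

    def growth(months, doubling_months):
        return 2 ** (months / doubling_months)

    for months in (12, 24, 48):
        cap = growth(months, capability_doubling_months)
        cost = growth(months, cost_doubling_months)
        # Rising cost-per-unit-capability means each extra hour of task horizon
        # gets more expensive; falling means efficiency gains are winning.
        print(f"{months:>2} mo: capability x{cap:6.1f}, cost x{cost:6.1f}, "
              f"cost per unit capability x{cost / cap:.2f}")

If the cost curve really does have the shorter doubling time, that ratio keeps climbing and the process is unsustainable without efficiency breakthroughs.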

Basically, I think the focus on recursive self-improvement is not really what matters in the real world. The actual question is how long and how expensive the learning process is. I think the answer is that it will be long and expensive, just like in our current world. No doubt having many more intelligent agents will help speed up parts of the loop, but there are physical constraints you can't get past no matter how smart you are.

  • How do you reconcile e.g. AlphaGo with the idea that data is a bottleneck?

    At some point learning can occur with "self-play", and I believe this is already happening with LLMs to some extent. Then you're not limited by imitating human-made data.

    For something like software development or mathematical proofs, it is easier to verify whether a solution is correct than to come up with it in the first place, and many domains are like this. Anything like that is amenable to learning from synthetic data or self-play, as AlphaGo did (a rough sketch of this loop is at the end of this comment).

    I can understand that people who think of LLMs as human-imitation machines, limited to training on human-made data, would think they'd be capped at human-level intelligence. However I don't think that's the case, and we have at least one example of superhuman AI in one domain (Go) showing this.

    Regarding cost, I'd have to look into it, but I'm under the impression costs have gone up and down over time: models have grown, but there have also been efficiency improvements.

    I'd hazard a guess that end-user costs have not grown exponentially the way time-horizon capabilities have, even though investment in training probably has. That's tricky to reason about, though, because training costs are amortised and it's not obvious whether end-user prices are set at a loss, or what the profit margin is for any given model.

    On fast vs. slow takeoff: Yud does seem to believe in a fast takeoff, yes, but it's also one of the oldest disagreements in rationalist circles, one on which he disagreed with his main co-blogger on the original rationalist blog, Overcoming Bias. There is some discussion of this and of more recent disagreements here [1].

    [1] https://www.astralcodexten.com/p/yudkowsky-contra-christiano...
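
    As promised above, here is a minimal sketch of that generate-then-verify loop, in the spirit of rejection sampling for synthetic training data. The toy tasks, the candidate pool, and all the function names are hypothetical stand-ins rather than any lab's actual pipeline; the point is only that a cheap verifier lets a model learn from its own outputs without new human-made data.

        import random

        # Toy "tasks": each has an input and an expected output a cheap checker can verify.
        TASKS = [{"input": (2, 3), "expected": 5},
                 {"input": (7, 4), "expected": 11}]

        # Stand-in "model": proposes candidate programs, most of them wrong.
        CANDIDATES = [lambda a, b: a - b, lambda a, b: a * b, lambda a, b: a + b]

        def propose(task, n=8):
            """Hypothetical proposer: sample n candidate solutions."""
            return [random.choice(CANDIDATES) for _ in range(n)]

        def verify(task, candidate):
            """Verification is the cheap direction: just run the candidate and check."""
            return candidate(*task["input"]) == task["expected"]

        def collect_verified(tasks):
            """Keep only (task, solution) pairs that pass the checker; these become
            synthetic training data, with no human-written solutions required."""
            kept = []
            for task in tasks:
                for cand in propose(task):
                    if verify(task, cand):
                        kept.append((task, cand))
                        break
            return kept

        print(f"verified examples: {len(collect_verified(TASKS))} / {len(TASKS)}")

    In real systems the verifier is something like unit tests or a proof checker, and the proposer is the model being trained on the verified examples, but the shape of the loop is the same.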

    • AlphaGo showed that RL + search + self-play works really well if you have an easy-to-verify reward and millions of iterations. Math partially falls into this category via automated proof checkers like Lean, so that's where I would put the highest likelihood of things getting weird really quickly. It's worth noting that this hasn't happened yet, and I'm not sure why. It seems like this recipe should already be yielding results in terms of new mathematics, but it isn't yet.

      That said, nearly every other task in the world is not easily verified, including things we really care about. How do you know if an AI is superhuman at designing fusion reactors? The most important step there is building a fusion reactor.

      I think a better reference point than AlphaGo is AlphaFold. DeepMind found some really clever algorithmic improvements, but they didn't know whether they actually worked until the CASP competition. CASP evaluated their model on new X-ray crystal structures of proteins. Needless to say, getting X-ray protein structures is a difficult and complex process. Also, they trained AlphaFold on thousands of existing structures that were accumulated over decades and required millennia of graduate-student hours to find. It's worth noting that we have very good theories for all the basic physics underlying protein folding, but none of the physics-based methods work. We had to rely on painstakingly collected data to learn the emergent phenomena that govern folding. I suspect that this will be the case for many other tasks.

    • > How do you reconcile e.g. AlphaGo with the idea that data is a bottleneck?

      Go is entirely unlike reality in that the rules are fully known and the game can be perfectly simulated by a computer. AlphaGo worked because everything was simulated, so it could run millions of trials in a short time frame. That doesn't answer the question of how an AI improves its general intelligence without real-world interaction and data gathering. If anything, it points to the importance of running many experiments and gathering data, and that becomes a bottleneck when you can't simply make the experiment run faster, because the experiment is limited by physics.

> If you think AI will reach human-level intelligence then recursive self-improvement naturally follows. How could it not?

Humans have a lot more going on than just an intelligent brain. The two big ones are bodies, with which to richly interact with reality, and emotions/desires, which drive our choices. The one that I don't think gets enough attention in this discussion is the body. The body is critical to our ability to interact with the environment, and therefore to learn about it. How does an AI do this without a body? We don't have any kind of machine that comes close to the level of control, feedback, and adaptability that a human body offers. That seems very far away. I don't think an AI can just "improve itself" without being able to interact with the world in many ways and experiment. How does it find new ideas? How does it test its ideas? How does it test its abilities? It needs an extremely rich interface with the physical world; that external feedback is necessary for improvement. That requirement would put the prospect of a recursively self-improving AI much further into the future than many rationalists believe.

And of course, the "singularity" scenario doesn't just assume recursive self-improvement; it assumes exponential recursive self-improvement all the way to superintelligence. This is highly speculative. It's just as possible that the curve turns out logarithmic, sigmoid, or linear. Believing that fully exponential self-improvement is the likely scenario, based on the curve of a metric that hasn't existed for very long, does not seem solid enough to justify a strong belief. It is just as easy to imagine that intelligence gains get harder and harder as intelligence increases. We see many things that are exponential for a time and then aren't anymore, and basing big decisions on "this curve will be exponential all the way" because we're seeing exponential progress now, at the very early stages, does not seem sound.
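
To put a number on that last point: a saturating (logistic) curve is numerically almost indistinguishable from a pure exponential in its early phase, so an unbent curve so far doesn't tell you which of the two you're on. A minimal sketch, with arbitrary made-up parameters chosen only to show the shape:

    # Early on, a logistic curve and a pure exponential look nearly identical;
    # they only diverge as the logistic nears its ceiling. Parameters are
    # arbitrary and exist purely to illustrate the shape argument.
    import math

    def exponential(t, r=0.5):
        return math.exp(r * t)

    def logistic(t, r=0.5, ceiling=1000.0):
        # Roughly the same early growth rate r, but saturating at `ceiling`.
        return ceiling / (1.0 + (ceiling - 1.0) * math.exp(-r * t))

    for t in range(0, 21, 4):
        e, l = exponential(t), logistic(t)
        print(f"t={t:>2}: exponential={e:10.1f}  logistic={l:8.1f}  ratio={l / e:.2f}")

The two track each other closely at first and only separate as the ceiling approaches, which is exactly why "the curve hasn't bent yet" is weak evidence about where it ends up.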

Humans have human-level intelligence, but we are very far from understanding our own brain well enough to modify it to increase our capacity for intelligence (to any degree significant enough to be comparable to recursive self-improvement). We have to improve the intelligence of humanity the hard way: spend time in the world, see what works, let the smart humans make more smart humans (as do the dumb humans, which often slows the progress of the smart humans). The time spent in the world, observing and interacting with it, is crucial to this process. I don't doubt that machines could do this faster than humans, but I don't think it's at all clear that they could do so, say, 10,000x faster. A design needs time in the world to see how it fares before you can gauge its success. You don't get to escape this until you have a perfect simulation of reality, which, if it is possible at all, is likely not possible until the AI is already superintelligent.

Presumably a superintelligent AI has a complete understanding of biology; how does it get that without spending time observing the results of biological experiments and iterating on them? Extrapolate that to the many other complex phenomena that exist in the physical world. This is one of the reasons our understanding of computers has advanced so much faster than our understanding of many physical sciences: to understand a complex system that we didn't create and don't have a perfect model of, we must do lots of physical experiments, and those experiments take time.

The crucial assumption the AI-singularity scenario relies on is that once intelligence hits a certain threshold, it can gaze at itself and self-improve to the top very quickly. I think this is fundamentally flawed, as we exist in a physical reality that underlies everything and defines what intelligence is. Interaction and experimentation with reality are necessary for the feedback loop of increasing intelligence, and I think this both severely limits how short that feedback loop can be and raises the bar for an entity that can recursively self-improve, as it needs a physical embodiment far more complex and autonomous than any robot we've managed to make.