← Back to context

Comment by frisbee6152

4 hours ago

The most important thing I would point to is Mythos et al and the wave of vulnerabilities that have been discovered in the past couple months. It’s a completely unprecedented event, brought forth almost entirely by improvements in the models themselves. That said. keep in mind, I’m talking about over the past two years. With Claude code and the capabilities gained since December of last year, there have been incredible gains in the capabilities that are now available. Demand for inference is higher now than it was a year ago, because capability has improved. A specific criticism that I would hold is that claiming that progress with LLMs is slowing, prior to that point, is embarrassingly wrong in my view. One could argue that the model capability improvements are slowing, and all the improvements were in harnesses. I think that’s a stronger argument, but I have a few problems with it. 1. Utility is utility. Whether that comes from the model or the harness is irrelevant when making claims about utility. I don’t think that’s a useful distinction most of the time, but especially when talking about the technology as a whole. 2. Marginal intelligence gain is different than marginal utility gain. It’s estimated that intelligence grows logarithmically relative to investment. However, the utility of a marginally more intelligent model may grow exponentially, because once behavior crosses a reliability threshold, it unlocks new capabilities. 3. Even on those terms, it’s not clear to me that frontier capabilities are slowing down. With Mythos and its contemporaries, we have been seeing a vast change in the security industry as vulnerabilities are discovered at an unprecedented rate. OpenBSD vulnerabilities, more Firefox vulnerabilities found in a single month than the past two years, critical Linux vulnerabilities. It’s hard for me to look at the effects there, a radical new capabilities baked into the model itself, and see stagnation. A part of the reason it might feel like it’s slowing down is because we plebs don’t have access to the top models.

The maintainer of curl - who has access to mythos - disagrees [0].

I think it's dangerous to rely on claims made by people who financially profit from you believing them without checking.

[0]: https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...