Comment by frisbee6152

4 hours ago

The most important thing I would point to is Mythos et al. and the wave of vulnerabilities that have been discovered in the past couple of months. It’s a completely unprecedented event, brought forth almost entirely by improvements in the models themselves.

That said, keep in mind that I’m talking about the past two years. With Claude Code and the capabilities gained since December of last year, there have been incredible gains in what is now available. Demand for inference is higher now than it was a year ago, because capability has improved. A specific criticism I would make is that claiming that progress with LLMs was slowing, prior to that point, is embarrassingly wrong in my view.

One could argue that model capability improvements are slowing and that all the improvements were in harnesses. I think that’s a stronger argument, but I have a few problems with it.

1. Utility is utility. Whether it comes from the model or the harness is irrelevant when making claims about utility. I don’t think that’s a useful distinction most of the time, and especially not when talking about the technology as a whole.

2. Marginal intelligence gain is different from marginal utility gain. It’s estimated that intelligence grows logarithmically relative to investment. However, the utility of a marginally more intelligent model may grow exponentially, because once behavior crosses a reliability threshold, it unlocks new capabilities.

3. Even on those terms, it’s not clear to me that frontier capabilities are slowing down. With Mythos and its contemporaries, we have been seeing a vast change in the security industry as vulnerabilities are discovered at an unprecedented rate: OpenBSD vulnerabilities, more Firefox vulnerabilities found in a single month than in the past two years, critical Linux vulnerabilities. It’s hard for me to look at the effects there, radical new capabilities baked into the model itself, and see stagnation.

Part of the reason it might feel like it’s slowing down is that we plebs don’t have access to the top models.
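A toy illustration of the reliability-threshold point (point 2). This is only a sketch under a loud assumption that is not from any source in this thread: a task is a chain of independent steps, each succeeding with the same probability. The names `task_success` and `steps_at_half` are made up for the example.

```python
import math


def task_success(step_reliability: float, steps: int) -> float:
    """Probability that every one of `steps` independent steps succeeds.

    Assumption (illustrative only): steps are independent and equally reliable.
    """
    return step_reliability ** steps


def steps_at_half(step_reliability: float) -> float:
    """Task length at which the overall success probability drops to 50%."""
    return math.log(0.5) / math.log(step_reliability)


# Compare a few hypothetical per-step reliabilities on a 50-step task.
for p in (0.90, 0.99, 0.999):
    print(
        f"per-step={p}: 50-step success={task_success(p, 50):.3f}, "
        f"50% horizon={steps_at_half(p):.0f} steps"
    )
```

Under that assumption, moving per-step reliability from 0.90 to 0.99 takes a 50-step task from under 1% success to about 60%. A modest-looking gain per step is the difference between a task being impractical and being usable, which is the kind of threshold effect meant above. Real tasks are not independent chains, so treat the numbers as intuition only.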

6 comments

lompad 3 hours ago

The maintainer of curl - who has access to Mythos - disagrees [0].

I think it's dangerous to rely on claims made by people who financially profit from you believing them without checking.

[0]: https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...

jsnell 3 hours ago

That blog post is very clear about the maintainer having no access to Mythos.

IsTom 2 hours ago

Does it matter that somebody else ran it for him?

frisbee6152 2 hours ago

The article says in the second section that the author did not have access to Mythos. I think it’s dangerous to rely on claims made by others without even bothering to read them first, let alone check.

It found hundreds of vulnerabilities in Firefox, according to Mozilla: how does Mozilla benefit? It found a 27-year-old vulnerability in OpenBSD. How do they benefit from that? Is that made up? Are the maintainers of those codebases lying for the benefit of Anthropic’s IPO? Is Copy Fail a fabrication by big AI? The 12 OpenSSL vulnerabilities found in January?

https://venturebeat.com/security/mythos-detection-ceiling-se...
https://www.wired.com/story/mozilla-used-anthropics-mythos-t...
https://cyberscoop.com/copy-fail-linux-vulnerability-artific...
https://www.schneier.com/blog/archives/2026/02/ai-found-twel...

I’m not sure whose claims you think I’m relying on. I trust Firefox not to overstate the number of CVEs they’ve found. Same for OpenSSL. The OpenBSD folks definitely don’t seem like the types. I’ve not known Linux to fabricate CVEs either. I think my sources are fine.

slopinthebag 4 hours ago

Do you have access to Mythos?

frisbee6152 2 hours ago

Nope. Just watching the volume and severity of CVEs coming through since it’s been running. It’s been a busy few months.