← Back to context

Comment by 827a

5 hours ago

The second case is not what Anthropic did either, though. If you have their process internalized as "open freebsd, tell mythos 'find vulns', done" this is not what happened. They have a harness that went file-by-file, spawned a subagent for each file, told it to find vulns in that file, then a post-processing step (more on that in a sec).

In that sense: The AISLE replication still provides too much information to the model, but its not far off, and others have replicated Mythos' findings in a more clandestine manner on open source models. Some were totally capable of finding the same vulns Mythos found back in ~March (and today, the new Kimi K2.7 is looking extremely good, very little doubt it could do it).

The critical difference is that post-processing: the Mythos model/harness has some step to induce Mythos to actually exploit the vulnerability, leveraging its ability to do so as a ranking mechanism. Anthropic inferred that this led Mythos to discover vulnerabilities nothing else could discover, which is not true, and Anthropic should be held accountable for this weird artifact of that communication. However:

- An OSS model might find the vulnerability but rank it as a 3/10. Mythos finds it, chains it with a second vulnerability, now suddenly its an 8/10.

- An OSS model might find the vulnerability, alongside fifty other vulnerabilities. The operator ignores all of them.

The problem with automated vulnerability detection, including with LLMs, is that they find the haystack, not the needle. Every piece of hay might be a vulnerability, but whether its worthy of fixing is another matter. Mythos does represent a meaningful improvement; it better finds the needle.

Even if some models are able to find some large fraction of the vulns that Mythos finds, the contention (that those of us outside Anthropic and a few select partners do not yet have the ability to replicate) is that Mythos not only found vulns, but was able to string them together into working exploits, very close to autonomously or at list with minimal hand-holding. I believe that UK AISI has at least in part confirmed that Mythos is better at this than other models like GPT 5.5, even while it's not better at the simply finding vulns part (although, again, no one else is currently able to replicate their results).

This was the primary reason for not releasing it. The difference in the two primary camps around this topic are that the doubter group thinks that Anthropic, and all of their partners, are essentially lying about this (since no one outside Anthropic and select partners has the access to replicate), whereas the other side believes that Anthropic and partners are probably mostly telling the truth without too much exaggeration.

Neither camp has evidence other than unconfirmable reports and/or arguments about economic incentives. I personally think that Anthropic has, in the past, mostly not lied about things like this and has by far been the most transparent and open AI company. That could change, and they could be lying now, but I think that the camp that is certain that they are is far, far too confident in their belief.