Comment by cvwright
8 hours ago
It’s easy to find sketchy lines of code in any large C project.
The big advance that they are claiming with Mythos is the ability to triage all the hundreds of candidate vulns and automatically generate exploits to prove that the real ones are real. And if they’re really finding 27-yr-old 0-days in OpenBSD, then it’s not just hype.
I do not think you need a great model to do this, just great automation. There’s a reason they haven’t open sourced the actual process that did this, with the Mythos model itself stubbed out.
About five minutes into this video: https://www.youtube.com/watch?v=1sd26pWhfmg
They also say publicly in their Opus 4.6 post (https://red.anthropic.com/2026/zero-days/):
>In this work, we put Claude inside a “virtual machine” (literally, a simulated computer) with access to the latest versions of open source projects. We gave it standard utilities (e.g., the standard coreutils or Python) and vulnerability analysis tools (e.g., debuggers or fuzzers), but we didn’t provide any special instructions on how to use these tools, nor did we provide a custom harness that would have given it specialized knowledge about how to better find vulnerabilities. This means we were directly testing Claude’s “out-of-the-box” capabilities, relying solely on the fact that modern large language models are generally-capable agents that can already reason about how to best make use of the tools available.
Again, marketing materials by Anthropic. You realize this is by Anthropic themselves, right? And again, not reproducible by outsiders. So useless.
What's the CVE for the 27-yr-old 0-day in OpenBSD?
Depends on the impact? CVE scores are known to be a worthless metric when looking at the actual impact.
Linux now labels every single bug as a CVE.
I think they mean what is the actual vulnerability and not the score.