Comment by dataflow

7 hours ago

That's great and all but how severe were the most severe vulnerabilities found? I imagine they don't want to talk about it, but that's really the most interesting and important bit.

As much as I’d like to share in the skepticism, the very beginning of the article states it very plainly — this is a step function.

Lots of people feel that Mythos is a psyops campaign, but I don’t really understand the skepticism. Most of it seems to stem from the general distrust of things that aren’t publicly available.

A few Anthropic employees have described Mythos as a general purpose model improvement, but that claim has yet to be widely backed up so that’s the only place I’m remaining skeptical.

For the domain of security research, I’m willing to buy the narrative.

  • In his interview on the Hard Fork podcast, Palo Alto Networks’ CEO described the capability change from Opus to Mythos being more about availability; evidently it runs in a very compute-intensive, always-on mode. Unclear if the base model is significantly different, but Arora ascribed the difference mostly to that change.

  • > As much as I’d like to share in the skepticism, the very beginning of the article states it very plainly — this is a step function.

    To be fair, they can't say "You know, Mythos is better, but improvements are overhyped af". Moreover, their explanation of that "step change" is strange. It sounds like Mythos isn't that much better at finding vulnerabilities (which is very strange, given statements from Mozilla), but is way stronger at working with them.

    > Lots of people feel that Mythos is a psyops campaign, but I don’t really understand the skepticism. Most of it seems to stem from the general distrust of things that aren’t publicly available.

    1) Attempts to spin the idea about "Super powerful general purpose model that can't be released for some not so clear reasons" are usually a very bad sign. OpenAI proves it.

    2) Mythos system card has a lot of strange moments, errors and things that sound like attempts to deceive.

    3) It's strange that Anthropic is struggling with both Sonnet 5.0 and Opus 5.0, but at the same time has a breakthrough in the form of Mythos.

    > A few Anthropic employees have described Mythos as a general purpose model improvement, but that claim has yet to be widely backed up so that’s the only place I’m remaining skeptical.

    Article describes Mythos as a cybersecurity-specific model though. It's yet another unclear moment.

  • A general distrust of things that aren't publicly available is very healthy. We should all do more of that!

    Honest question, do you buy the narrative of everyone trying to sell you a product?

  • > As much as I’d like to share in the skepticism, the very beginning of the article states it very plainly — this is a step function.

    That's great and all, but nobody was being skeptical or asking anything about whether Mythos is or isn't a step function. Mythos could be a ten-dimensional ladder and it wouldn't change my question. The question wasn't about Mythos, but about Cloudflare: what did they find? That question is entirely fair and expected regardless of whether vulnerabilities are found via Mythos, the NSA, or a caveman.

I've settled on the opinion that it's much more creative and able to run agentically for longer periods of time. So, despite it not having drastically better "hard skills", it's able to combine those skills in more effective ways.

Right now, many of these vulns are identifiable by Opus, but they still require a human-in-the-loop (and often a skilled one) to guide it towards complex exploits. Take the human out of the loop, and it becomes a lot easier for the average person to identify and leverage an exploit.

Most of their new products are AI tools that nobody uses, so I guess they’ll keep posting slop. And recently, they’ve fired so many people that they probably don’t have good writers anymore.