Comment by roxolotl
6 hours ago
What does this mean?
> It's a different kind of tool doing a different kind of work, and that makes a clean apples-to-apples comparison to earlier models difficult.
They claim it’s a different kind of tool and then describe using it the same way you’d use any other model. This really felt way worse than the average Cloudflare blog and really just rehashed the Mythos announcement which had already called out the key parts being chaining and crafting examples.
> They claim it’s a different kind of tool and then describe using it the same way you’d use any other model. This really felt way worse than the average Cloudflare blog and really just rehashed the Mythos announcement which had already called out the key parts being chaining and crafting examples.
Hah, I was trying to parse this too.
Charitably, perhaps they're being vague about exactly what's different because they're still under NDA.
> way worse than the average Cloudflare blog
How long has it been since you took your average? Lately all Cloudflare output has been heavily AI'd.
> the model has its own emergent guardrails that sometimes cause it to push back on legitimate security research requests. But as we found, these organic refusals aren’t consistent - the same task, framed differently or presented in a different context, could produce completely different outcomes as illustrated in the examples below.
This was new. I'm surprised that a model specifically designed for security research and gated to professionals is refusing legitimate requests.
There's pretty strong evidence that (mis)alignment in one area creates (mis)alignment in others. The "aligned behavior" vectors are not orthogonal from cybersecurity to bioweapons to prejudice, so having alignment in some will likely bleed into others.
It sounds different because it's a hidden advertisement, not a regular blog post.
But why would Cloudflare advertise Anthropic? They are competing with Anthropic by hosting open-weight models.
https://www.cloudflare.com/press/press-releases/2025/cloudfl...
The post says they wrote a custom harness that orchestrates work between multiple separate model invocations. That is different from running Claude Code (which is a specific existing harness around the Claude models).
The post takes a while to get around to saying that, and could have included more detail besides the workflow diagram and table (which they flag as only "an example of" such a harness), but it does answer the question. It's a different kind of tool because it's a model rather than a harness+model pair.
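For anyone wondering what "a harness that orchestrates work between multiple separate model invocations" might look like in the abstract, here is a minimal sketch. It is not Cloudflare's harness: `run_harness`, `fake_model`, and the stage names are all hypothetical, and a real harness would call an actual model API where the stub is.

```python
from typing import Callable, List

# Hypothetical stand-in type for "one model invocation": prompt in, text out.
ModelFn = Callable[[str], str]


def run_harness(target: str, model: ModelFn, stages: List[str]) -> List[str]:
    """Run each stage as its own model invocation, feeding output forward."""
    transcript: List[str] = []
    context = target
    for stage in stages:
        # Each stage gets a fresh prompt built only from the previous output,
        # so invocations stay separate instead of sharing one long session.
        prompt = f"[{stage}] {context}"
        output = model(prompt)
        transcript.append(output)
        context = output
    return transcript


def fake_model(prompt: str) -> str:
    # Deterministic stub (an assumption) so the sketch runs offline.
    return f"result of {prompt}"


if __name__ == "__main__":
    # Stage names are made up for illustration.
    for line in run_harness("parser.c", fake_model, ["triage", "analyze", "verify"]):
        print(line)
```

The point is only that the orchestration logic lives outside the model, so the same model could in principle be dropped into a very different harness.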
"It's not X, it's Y" is also a common LLM trope.
My guess is because it is a model trained specifically for security/hacking. So comparing it to Opus, trained for chat/code/etc., is apples-to-oranges.
It is not; that's what surprised Anthropic employees too.
I think what they might mean is:
Because of its capabilities, a new kind of harness can be built for it, so the entire system (model + harness) is a different kind of tool than, say, Claude Code.
But did they build this different harness? And are they sure other models can't cope with it?
Right, I expected the piece to transition into “and here’s how we built a whole new thing for it” but it never did.