
Comment by danieltanfh95

6 hours ago

People really lack imagination. The point here is that a dedicated attacker with a good harness and really cheap models can run the attack regardless. It's like port-scan or URL-search attacks: they could run all of these against all codebases and clients. On the flip side, this also means we could run cheap models against every PR made and do a thorough red-team security review.

None of this requires Mythos. If anything, we just need an Opus 4.5+ that is not lobotomised.

That is a point. It might even be true. But showing a small model an example of vulnerable code and asking it to confirm that the code is vulnerable isn't evidence for that point!

  • No, it is evidence for that point. You could just rattle off every possible vulnerability and have the cheap model scan for it in the harness through a loop.

    Note that I say cheap, not small: small models may lack the reasoning needed, but some models are cheap and still retain enough reasoning (à la Sonnet 3.7+).
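
    A minimal sketch of that loop, to make the idea concrete. Everything here is an assumption for illustration: `VULN_CLASSES` is a tiny hypothetical checklist, `ask_model` is a placeholder for whatever cheap model call the harness wraps, and `fake_model` is a stand-in so the sketch runs without any API. It shows the shape of the loop, not evidence that a given model actually catches anything.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical checklist; a real harness would enumerate far more classes.
VULN_CLASSES = ["SQL injection", "command injection", "path traversal"]


def scan(
    files: Dict[str, str],
    ask_model: Callable[[str], str],
    vuln_classes: Iterable[str] = VULN_CLASSES,
) -> List[Tuple[str, str]]:
    """Ask the model about every (file, vulnerability class) pair.

    `ask_model` is an assumed interface: it takes a prompt string and
    returns the model's text answer. Returns the pairs flagged YES.
    """
    findings = []
    for path, source in files.items():
        for vuln in vuln_classes:
            prompt = (
                f"Does the following code contain a {vuln} vulnerability? "
                f"Answer YES or NO first.\n\nFile: {path}\n\n{source}"
            )
            answer = ask_model(prompt)
            # Treat anything that does not start with YES as a negative.
            if answer.strip().upper().startswith("YES"):
                findings.append((path, vuln))
    return findings


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, only so the sketch is runnable."""
    looks_like_sqli = "execute(" in prompt and "+" in prompt
    return "YES" if "SQL injection" in prompt and looks_like_sqli else "NO"


if __name__ == "__main__":
    changed_files = {
        "app.py": 'cur.execute("SELECT * FROM t WHERE id=" + uid)',
        "ok.py": "print('hi')",
    }
    print(scan(changed_files, fake_model))
```

    The cost is one model call per file per vulnerability class, which is why the model has to be cheap. The same loop could run over the changed files of every PR.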