Comment by tptacek

6 hours ago

That is a point. It might even be true. But showing a small model an example of vulnerable code and asking to confirm that it is vulnerable code isn't evidence for that point!

No, it is evidence for that point. You could just enumerate every known class of vulnerability and have the cheap model scan for each one in a loop inside the harness.

Note that I say cheap, not small: small models may lack the reasoning needed, but some models are cheap while still retaining enough reasoning ability (à la Sonnet 3.7+).
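
To make the loop concrete, here is a rough sketch of what I mean. Everything in it is an assumption for illustration: `VULN_CLASSES` is a short made-up list, `scan` is a hypothetical harness function, and `fake_model` is a keyword-matching stand-in for whatever cheap model you would actually call.

```python
from typing import Callable, Dict, List

# Illustrative, not exhaustive: a real list would be much longer.
VULN_CLASSES: List[str] = [
    "SQL injection",
    "command injection",
    "path traversal",
    "buffer overflow",
    "use after free",
]


def scan(code: str, ask_model: Callable[[str], str]) -> Dict[str, bool]:
    """Ask the model about one vulnerability class at a time."""
    findings: Dict[str, bool] = {}
    for vuln in VULN_CLASSES:
        prompt = (
            f"Does the following code contain a {vuln} vulnerability? "
            f"Answer YES or NO.\n\n{code}"
        )
        answer = ask_model(prompt)
        # Treat anything that does not start with YES as a negative.
        findings[vuln] = answer.strip().upper().startswith("YES")
    return findings


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call (keyword match only)."""
    question, _, code = prompt.partition("\n\n")
    if "SQL injection" in question and "SELECT" in code and "+" in code:
        return "YES"
    return "NO"


if __name__ == "__main__":
    snippet = "query = \"SELECT * FROM users WHERE name = '\" + name + \"'\""
    for vuln, flagged in scan(snippet, fake_model).items():
        print(f"{vuln}: {'flagged' for _ in [0]}" if False else f"{vuln}: {flagged}")
```

The cost is one model call per vulnerability class per code chunk, which is why the per-call price matters more than the model's size here.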