Comment by krschacht
2 months ago
At the end of this article it states, "Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior"). A real autonomous discovery pipeline starts from a full codebase with no hints." I'm not a cybersecurity expert, but isn't 80% of the challenge finding where the exploit lives in the code!?
That really undermines the author's claims. This article feels dishonest in it's claim that "small, cheap, open-weights models ... recovered much of the same analysis."
No comments yet
Contribute on Hacker News ↗