Comment by felipeerias

5 hours ago

From the article:

> Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior").

I mean you can still scale that? Ask a lighter model to go through every function to find vulnerabilities, take output to bigger model like Opus and classify the critical ones.