Comment by josephg
1 day ago
To be clear, we don’t know that this tool is better at finding bugs than fuzzing. We just know that it’s finding bugs that fuzzing missed. It’s possible fuzzing also finds bugs that this AI would miss.
I would suggest watching Nicholas Carlini's talk, and Heather Adkins and Four Flynn's talk, from unprompted:
https://youtu.be/1sd26pWhfmg?si=onOai_ocxkZeNWP0
https://youtu.be/B_7RpP90rUk?si=HkRBhw95DbbKX9lL
My takeaway is that fuzzing is not just complementary; it also gives a stronger AI a starting point. But AI is generally faster and better.
Thanks - these talks are mindblowing. Highly recommended.
Different methods find different things. Personally, I'd rather use a language that is memory safe plus a great static analyzer with abstract interpretation that can guarantee the absence of certain classes of bugs, at the expense of some false positives.
The problem is that these tools, such as Astrée, are incredibly expensive and therefore their market share is limited to some niches. Perhaps, with the advent of LLM-guided synthesis, a simple form of deductive proving, such as Hoare logic, may become mainstream in systems software.
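As a toy illustration of the Hoare-style reasoning mentioned above (this sketch and its function name are hypothetical, not from any particular tool): a Hoare triple {P} C {Q} says that if precondition P holds before command C runs, postcondition Q holds afterwards. A deductive prover establishes this for all inputs; runtime assertions only spot-check it, but they make the contract explicit in the code.

```python
# Hoare triple {x >= 0} y = x + 1 {y > 0}, checked dynamically.

def increment_nonneg(x: int) -> int:
    assert x >= 0   # precondition P
    y = x + 1       # command C
    assert y > 0    # postcondition Q
    return y

# Spot-check a few inputs; a prover would cover every x >= 0.
for x in (0, 1, 41):
    increment_nonneg(x)
```

The appeal of tools like Astrée is that they discharge the middle step by abstract interpretation rather than testing, at the cost of false positives on code they cannot prove safe.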
This line of reasoning makes no sense when the AI can just be given access to a fuzzer. I would guess that it probably did have access to a fuzzer to put together some of these vulnerabilities.
Carlini talked about that a fair amount in the context of pairing the two: e.g. many protocols are challenging for fuzzers because they have something like a checksum or signature, but LLMs are good at coming up with harnesses for things like that. I'm sure we're going to see someone building an integrated fuzzer soon which tries to do things like figure out how to get a particular branch to follow an unexercised path.
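To make the checksum point concrete, here is a minimal self-contained sketch (the toy packet format and all function names are invented for illustration). A naive mutator almost never produces a valid CRC, so mutated inputs die at the integrity check before reaching the parser; the kind of harness an LLM might write recomputes the checksum after mutation so coverage can go deeper.

```python
import random
import struct
import zlib

def parse_packet(data: bytes) -> bytes:
    """Toy parser: last 4 bytes must be the big-endian CRC32 of the payload."""
    if len(data) < 4:
        raise ValueError("too short")
    payload, crc = data[:-4], struct.unpack(">I", data[-4:])[0]
    if zlib.crc32(payload) != crc:
        raise ValueError("bad checksum")  # naive mutations almost always die here
    return payload

def fix_checksum(data: bytes) -> bytes:
    """Harness step: recompute the trailing CRC so mutated inputs
    survive the integrity check and exercise the parsing logic."""
    payload = data[:-4] if len(data) >= 4 else data
    return payload + struct.pack(">I", zlib.crc32(payload))

def mutate(data: bytes) -> bytes:
    """Single random byte flip, standing in for a fuzzer's mutator."""
    if not data:
        return data
    i = random.randrange(len(data))
    return data[:i] + bytes([data[i] ^ 0xFF]) + data[i + 1:]

seed = fix_checksum(b"hello world")
# Without the fix-up, any single-byte flip is rejected at the CRC check;
# with it, the mutated payload actually reaches the parser.
repaired = fix_checksum(mutate(seed))
assert parse_packet(repaired) is not None
```

Real fuzzers handle this the same way in spirit (a harness or custom mutator that re-validates the envelope), which is exactly the glue code LLMs are good at generating.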
This is obviously just cope (there's a long, strong-form argument for why LLM-agent vulnerability research is plausibly much more potent than fuzzing, but we don't have to reach it because you can dispose of the whole argument by noting that agents can build and drive fuzzers and triage their outputs), but what I'd really like to understand better is why? What's the impetus to come up with these weird rationalizations for why it's not a big deal that frontier models can identify bugs everyone else missed and then construct exploits for them?
I don't have an anti-AI stance. Maybe I should have spelled that out more clearly in my comment above. I'm as excited and terrified by this technology as everyone else. I think we're all in vicious agreement that we need defense-in-depth - including LLMs and fuzzing (and static analysis and so on).
An LLM can guide all of this work, but current models tend to slowly go off the rails if you don't keep a hand on the wheel. I suspect this new model will be the same. I've had Opus4.6 write custom fuzzing tools from scratch, and I've gotten good results from that. But you just know people will prompt this new model by saying "make this software secure". And it'll forget fuzzing exists at all.
Good lord, why such a virulent response to something that seems like we should be considering?
As someone in cybersecurity for 10+ years my immediate assumption is why not both? I don’t think considering that they could both have their uses is “cope”.
Again: LLM agents already are both. But it's also remarkable, and worth digging into, that LLM agents haven't needed fuzzers to produce many (any? in Anthropic Red's case?) of the vulnerabilities they're discussing.
4 replies →
You said it yourself. It's cope. That's all it is and all it ever was.
https://en.wikipedia.org/wiki/AI_effect
Every time an AI does something new, there's a human saying "it's not really doing that something", "it's doing that something in a fake way" or "that something was never important in the first place".
Alright, except that’s not what I was saying. I was just pointing out that LLMs don’t replace fuzzing or static analysis. They complement those techniques. And yes, LLMs may drive those techniques directly, but they often don’t. At least not yet.
AI can initiate fuzzing and optimize the fuzzing process.