
Comment by tptacek

1 day ago

Again: LLM agents already are both. But it's also remarkable and worth digging into the fact that LLM agents haven't needed fuzzers to produce many (any? in Anthropic Red's case?) of the vulnerabilities they're discussing.

Do we know that? I'd love to see some of the ways security researchers are using LLMs. We have no idea whether Claude was using fuzzing here or just reading the files and spotting bugs directly in the source code.

A few weeks ago someone described their method for finding bugs in Linux: they prompted Claude with "Find the security bug in this program. Hint: It is probably in file X." and did that for every file in the repo.
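A minimal sketch of that per-file loop, assuming a hypothetical `ask_llm` helper standing in for a real LLM API call (the actual tooling they used wasn't shared):

```python
import os

# Hypothetical stand-in for a real LLM API call; it just echoes part of the
# prompt so the loop is runnable without an API key.
def ask_llm(prompt: str) -> str:
    return f"(model response for: {prompt[:40]}...)"

def audit_repo(repo_root: str) -> dict[str, str]:
    """Prompt the model once per source file, hinting at which file to inspect."""
    findings = {}
    for dirpath, _, filenames in os.walk(repo_root):
        for name in filenames:
            # Only audit C sources/headers in this sketch.
            if not name.endswith((".c", ".h")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                source = f.read()
            prompt = (
                "Find the security bug in this program. "
                f"Hint: It is probably in file {name}.\n\n{source}"
            )
            findings[path] = ask_llm(prompt)
    return findings
```

No fuzzing involved here at all: the model only ever sees the source text, which is exactly the distinction being asked about.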

Are you saying that LLMs can use fuzzers, or are you saying that they work like fuzzers? Because one of those is less… deterministic than the other.

Regardless, and in the spirit of my original response, my answer would be to give the LLM access to a fuzzer (plus other tools, etc.) but also keep fuzzers in the pipeline. Partly because that increases the determinism in the mix, and partly because why not? Layering is almost always better than not.

But again, more than anything I'm focusing on the accusations of cope. People SHOULD have measured reactions to claims about any product. People SHOULD be asking questions like this. I know the LLM debate is often "spicy", but man, let's just try to lower the temperature a bit, yeah?

  • LLMs can use fuzzers and also LLMs can explore the semantic space of a program in ways fuzzers can't.