
Comment by alganet

2 months ago

> Several people have written lengthy, detailed responses to your questions.

No, they haven't. Just read the thread.

> You're partly right that credentials alone don't prove anything.

I am totally right. Saying "believe me, I work on this" is lazy and a bad argument. There was simply no technical discussion to back that up.

> When tptacek mentioned his background, he also explained that static analysis tools have failed commercially for decades despite massive investment

I am not convinced that static analysis tools failed that hard. When I mentioned sanitizers, for example, he simply disappeared from the conversation and left that subject.

Also, suddenly, 22 bugs are found and there's a new holy grail in security analysis? You must understand that this is not enough.

> You've decided that people disagreeing with you are trying to manipulate you with authority

That's not a decision. An attempt to use credentials happened and it's there for anyone to see. It's blatant, I don't need to frame it.

> SAST tools with human triage have been ineffective for 20+ years despite enormous investment

I am not convinced that this is true. As I mentioned, sanitizers work really well at mitigating lots of security issues. They're an industry standard.

Also, I am not totally convinced that the LLM solution is that much better. It's fairly recent, only found a couple of bugs, and it still has much to prove before it becomes a valuable tool. Promising, but far from the holy grail you folks are implying it to be.

> if you want people to keep engaging with you

I want reasonable, no-nonsense people engaging with me. You seem to imply that your time is somehow more valuable than mine, and that it's me who somehow owes you. That is simply not true.

On sanitizers: You brought them up earlier and the conversation moved on without really addressing it. That's a legitimate complaint. Sanitizers (AddressSanitizer, MemorySanitizer, etc.) are extremely effective at what they do. If the claim is "static analysis has been useless for 20 years," that's obviously too strong—plenty of tools have found real bugs and prevented real vulnerabilities.

On the evidence: These LLM-assisted tools are quite new. Curl finding 22 potential issues is interesting but you're right that it's early days. Declaring this definitively transformative based on limited public evidence is probably premature.

But let's be clear about something else:

You've been consistently rude to people trying to engage with you. Multiple people wrote lengthy, substantive responses; you can scroll up and count the paragraphs. Saying "No, they haven't. Just read the thread" is one of the nicest ways you've engaged. Either you assume we're dishonest, or you genuinely can't recognize when someone is making an effort.

When someone asks you to be more courteous, "it's really not my problem" is a dick move. Nobody said you owe anyone deference. The ask was simpler: don't call good-faith engagement manipulation or suppression or ranking. That's basic forum etiquette, not hierarchy.

And this: "You seem to imply that your time is somehow more valuable than mine, and that it's me who somehow owes you." Nobody implied that. Several people spent time explaining things. You've spent time questioning them. That's symmetrical. What's asymmetrical is that you keep framing their explanations as evasion or authority-wielding while treating your skepticism as pure rationality. That's exhausting to deal with.

Fuck, man. I haven't gotten a paycheck in 2 years. Your time is objectively worth more than mine. You make up reasons to infer we're actually saying "you're worthless," which then requires your interlocutor to point out that they couldn't have meant that, since they're objectively worse off than you on whatever metric you mind-read them as comparing you on. Really sick behavior, even though I'm sure it's unintentional and you really do think you're being put down like we're at the 5th-grade lunch table. I'd never before had to roll over, show my belly, and do the "I'm unemployed!!11!" thing just to get someone to stop being a dick. In this thread, I've had to do it twice so far.

On the actual technical question:

The narrower claim (which might be more defensible) is that SAST tools generate enormous amounts of output that requires expert triage, and that triage step has been the bottleneck. Humans don't scale to it; it's tedious and expensive. If LLMs can effectively automate that triage—not find new classes of bugs, but filter and prioritize what existing analyzers already flag—that could be valuable even if the underlying analysis is traditional.
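The triage architecture described here can be sketched in a few lines. Everything in this sketch is hypothetical: the `Finding` shape, the rule names, and the `llm_triage` verdicts are invented stand-ins (a real pipeline would parse actual analyzer output, such as SARIF, and call a real model):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """One raw hit from a (hypothetical) static analyzer."""
    rule: str        # which check fired, e.g. "buffer-overflow"
    location: str    # file:line the analyzer flagged
    reachable: bool  # context a triage step would have to work out

def llm_triage(finding: Finding) -> str:
    """Stand-in for an LLM call.

    A real pipeline would prompt a model with the finding plus
    surrounding code context; here the verdict is hard-coded so
    only the control flow is illustrated.
    """
    if not finding.reachable:
        return "false-positive"
    return "needs-review" if finding.rule == "buffer-overflow" else "low-priority"

def surface(findings: list[Finding]) -> list[Finding]:
    """Keep only findings the triage step says are worth a human's time."""
    return [f for f in findings if llm_triage(f) == "needs-review"]

# Invented raw output: the analyzer over-reports, triage filters.
raw = [
    Finding("buffer-overflow", "lib/a.c:120", reachable=True),
    Finding("buffer-overflow", "lib/b.c:88", reachable=False),
    Finding("unused-variable", "lib/c.c:42", reachable=True),
]
kept = surface(raw)
```

The point of contention maps onto this sketch directly: the analyzer decides what `raw` can ever contain, while the triage step only decides what survives into `kept`.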

Your architectural model (verbose analyzer → LLM triage) might be basically correct. The disagreement may just be about how significant that triage step is. You think it's 1% of the value. Others think the triage bottleneck was the whole reason these tools didn't work at scale.
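That disagreement is measurable in principle. A toy sketch of the comparison, with all findings and verdicts invented for illustration: hold the analyzer's raw output fixed, vary only the filtering strategy, and score each strategy against later-confirmed bugs.

```python
# Raw findings from one fixed analyzer run, identified by id (invented).
raw_findings = {"f1", "f2", "f3", "f4", "f5", "f6"}

# Which findings turned out to be real bugs. In practice this is only
# known after the fact; here it is simply made up.
true_bugs = {"f2", "f5"}

def severity_filter(findings: set[str]) -> set[str]:
    # Cheap, context-blind triage: e.g. keep only high-severity rules.
    return findings & {"f1", "f2"}

def llm_filter(findings: set[str]) -> set[str]:
    # Stand-in for context-aware model triage.
    return findings & {"f2", "f5"}

def score(surfaced: set[str]) -> tuple[float, float]:
    """Return (precision, recall) against the confirmed bugs."""
    hits = surfaced & true_bugs
    precision = len(hits) / len(surfaced) if surfaced else 0.0
    recall = len(hits) / len(true_bugs)
    return precision, recall

# Whatever the triage step does, it can only subtract: every surfaced
# finding is one the analyzer already flagged, so the analyzer bounds
# what any triage strategy can ever find.
assert llm_filter(raw_findings) <= raw_findings
```

Under these invented numbers, better triage improves both precision and recall without the analyzer changing at all, which is exactly the question in dispute: how much of the end-to-end quality lives in that step.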

That's a real technical question worth discussing. But it requires assuming people disagree with you because they actually disagree, not because they're trying to bamboozle you with credentials.

  • > The disagreement may just be about how significant that triage step is.

    Whether it is important or not is highly subjective.

    My statement is that _the quality is capped by the non-AI portion of the solution_, which is an objective statement. It means the solution should get better with a better static analyzer, but it probably won't get much better with a better model. That is a testable prediction that might reveal itself to be true or not. It's right there in the first comments I made.

    > not because they're trying to bamboozle you with credentials

    Let's not use credentials then!

    • *Finally.* After dozens of exchanges, you've stated a clear, testable technical claim: quality is capped by the static analyzer, not the LLM; better models won't help much, better analyzers will. That's actually interesting and worth discussing.

      But let's be clear about how we got here. You opened with "Something sounds fishy...I don't think they were [found by AI]." When challenged, you moved to "I concede LLM involvement but want to specify its role." Now you're at a specific testable hypothesis about quality caps. That's a lot of ground to cover while insisting everyone else has been evasive.

      On your technical claim:

      You might be wrong. Here's why the LLM could matter more than you think:

      Static analyzers produce massive amounts of potential findings. The problem has never been "they can't detect anything"; it's that they detect too much, with too many false positives, requiring expert judgment to separate signal from noise. That triage step requires understanding code context across files, project architecture and conventions, whether a potential issue is reachable, whether existing mitigations make it irrelevant, and how severe it actually is.

      If LLMs can do that context synthesis effectively—and early evidence suggests they can—then the bottleneck shifts. Your prediction assumes the analyzer's initial detection is the limiting factor. The opposing view is that contextualized triage is the limiting factor, and LLMs are good at exactly that kind of synthesis.

      That's testable. Run the same analyzer with human triage, basic filtering, and LLM triage. If you're right, they'll find the same bugs. If others are right, LLM triage will surface meaningful issues the other approaches miss.

      On "there was simply no technical discussion":

      This is flatly false. tptacek explained that SAST tools have been commercially ineffective for decades despite hundreds of millions in investment, that the triage bottleneck was the problem, and that LLM orchestration is the new variable. That's technical substance. You dismissed it, but it was there.

      I described using GPT-3 to port color science code across multiple languages, explaining direct experience with AI-assisted development. That's concrete technical detail.

      You can disagree with these points. But claiming they don't exist is either dishonest or you're not actually reading what people write.

      On sanitizers:

      You're using this as evidence that static analysis didn't fail, but sanitizers (AddressSanitizer, MemorySanitizer, etc.) are dynamic analysis—runtime instrumentation, not static analysis. They're not counterexamples to claims about SAST tools. The conversation moved on because your example was off-topic.

      On "let's not use credentials":

      Show me where someone did. Find me one comment where someone said "this is true because I worked at Google, full stop" without also providing technical explanation.

      You can't, because it didn't happen. Every time credentials came up, they were context for a substantive technical point. I mentioned my background while explaining my direct experience. tptacek identified as a security professional while explaining the SAST triage problem. You've been fighting a phantom so you could righteously reject authority instead of engaging with the actual arguments being made.

      On the pattern:

      You've consistently reframed disagreement as suppression. People are "trying to make me stop describing how to achieve a similar quality system." They're "upset" their authority isn't working. They're being "evasive" without you ever specifying what's being evaded.

      This isn't skepticism. It's reflexive defensiveness that treats every substantive response as an attack. It has made this conversation take 10x longer than necessary and turned it into arguments about the arguments instead of the actual technical question.

      The bottom line:

      You have a testable hypothesis about whether LLM triage is transformative or marginal. That's worth discussing. But you've been needlessly unpleasant, demonstrably wrong about what's in this thread, and you've burned a lot of goodwill from people who tried to engage you seriously.

      If you want to talk about the technical question, I'm here. But stop pretending you've been stonewalled when multiple people have given you detailed responses you simply didn't like.
