I'm not exactly pro-AI, but even I can see that their system clearly works well in this case. If you tune the model to favour false positives and add a quick human review step, I can imagine your response time being cut from days to hours (and your customers getting their updates that much faster).
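The trade-off described above could be sketched like this (a toy illustration, not anyone's actual pipeline; all names, scores, and the threshold are made up):

```python
# Hypothetical triage sketch: set the flagging threshold deliberately low so
# false positives are cheap (a quick human look) and false negatives are rare.
FLAG_THRESHOLD = 0.2  # low on purpose: favour false positives

def triage(packages, score):
    """Split packages into those escalated to human review and those auto-passed.

    `score` is any callable returning a malware-likelihood in [0, 1]
    (standing in for "static analysis + AI" signal).
    """
    flagged, passed = [], []
    for pkg in packages:
        (flagged if score(pkg) >= FLAG_THRESHOLD else passed).append(pkg)
    return flagged, passed

# Toy scores, purely illustrative:
scores = {"left-pad": 0.05, "evil-miner": 0.9, "odd-lib": 0.3}
flagged, passed = triage(list(scores), scores.get)
# "odd-lib" is probably a false positive, but a reviewer clears it in seconds,
# while lowering the threshold keeps "evil-miner"-style misses unlikely.
```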
You are assuming that they build their own models.
He literally said "Flagged packages are escalated to a human review team" in the second sentence. Wtf is the problem here?
What about packages that are not "flagged"? The model could also hallucinate when deciding whether or not to flag a package.
>What about packages that are not "flagged"?
You can't catch everything with normal static analysis either. The LLM just produces some additional signal here, so false negatives can be tolerated.
> We use a mix of static analysis and AI. Flagged packages are escalated to a human review team.
“Chat, I have reading comprehension problems. How do I fix it?”
Reading comprehension problems can often be caught with some static analysis combined with AI.
"LLM bad"
Very insightful.