Comment by snowhale
19 hours ago
the false positive rate (28% on clean binaries) is the real problem here, not the 49% detection rate. if you're running this on prod systems you'd be drowning in noise. also the execl("/bin/sh") rationalization is a telling failure -- the model sees suspicious evidence and talks itself out of it rather than flagging for review.
No comments yet
Contribute on Hacker News ↗