Comment by evolve-maz

13 hours ago

If I flag every line in your PR as a potential security bug then I have 100% recall.

Obviously you need a mixture of high recall and low false positive rate. If 7/8 flagged items are fine, it's much more likely people will ignore the warnings, much like they would with any security tool with a 90% false positive rate. That is not optimized for the customer.
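To put rough numbers on the "flag every line" case, here is a minimal sketch. The 8-line PR with a single real bug on line 3 is a made-up example chosen to match the 7/8 figure, and the helper names are hypothetical. Note that "false positive rate" here means the share of flagged items that turn out fine, which is 1 minus precision.

```python
def recall(tp: int, fn: int) -> float:
    """Share of real problems that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp: int, fp: int) -> float:
    """Share of flagged items that are real problems."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical PR: 8 lines, exactly one real bug (on line 3).
buggy_lines = {3}
flagged_lines = set(range(1, 9))  # the "flag everything" reviewer

tp = len(flagged_lines & buggy_lines)   # real bugs that were flagged
fp = len(flagged_lines - buggy_lines)   # clean lines that were flagged
fn = len(buggy_lines - flagged_lines)   # real bugs that were missed

flag_all_recall = recall(tp, fn)              # 1.0
flag_all_precision = precision(tp, fp)        # 0.125
false_alarm_share = fp / len(flagged_lines)   # 0.875, i.e. 7/8 flags are noise
```

Recall alone looks perfect, but only one flag in eight is worth reading.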

3 comments

onion2k 13 hours ago

The ideal is finding all the problems without getting any false positives, but in reality you often can't have that. An org's engineering culture should be designed to fix problems with systems. If you're seeing an 87.5% false positive rate, that should be seen as another engineering problem to fix. However, that's a separate issue to whether or not you accept false positives in a system designed to find problems.

Presenting it as either a system that misses real problems or a system that has a huge number of false positives is a false dilemma. You can have a system that's designed to find all the problems and then optimize it to reduce the false positives. If you can't reduce the number then you optimize to identify false positives as fast as possible. Just ignoring the identified problems on the assumption that they're false is a giant red flag and a signal that the org has a very broken engineering culture (but, as you say, that's quite common).

eranation 13 hours ago

Yep. Similarly - you can predict with 99.9% accuracy if a volcano will erupt today by using a rock that has "No" written on it.
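The rock is easy to sketch. The figures below are assumptions picked only to reproduce the 99.9% number: 1000 observed days with a single eruption, placed arbitrarily on day 500.

```python
# Hypothetical record: 1000 days, one eruption (day index 500 is arbitrary).
days = 1000
erupted = [day == 500 for day in range(days)]

# The rock: answers "No" every single day.
rock_says = [False] * days

correct = sum(pred == actual for pred, actual in zip(rock_says, erupted))
accuracy = correct / days  # 0.999

# Recall: of the days with a real eruption, how many did the rock call?
caught = sum(pred and actual for pred, actual in zip(rock_says, erupted))
eruption_recall = caught / sum(erupted)  # 0.0
```

Accuracy is 99.9% while recall on the one event that matters is zero, the mirror image of flagging every line.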

williamdclt 11 hours ago

> If I flag every line in your PR as a potential security bug then I have 100% recall.

No. A code review isn't about "flagging a line of code", it's about identifying an issue or a risk. If a 10-line PR has one issue and you leave a comment on every single character but still miss the issue, you have 0% recall.
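A minimal sketch of that distinction, counting recall per issue rather than per line. The `Comment` type, the issue name `missing-auth-check`, and one comment per line (standing in for "every single character") are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Comment:
    line: int
    identifies: Optional[str]  # id of the real issue it names, or None


# Hypothetical 10-line PR with exactly one real issue.
pr_lines = 10
real_issues = {"missing-auth-check"}

# A comment lands on every line, but none of them names the actual issue.
comments = [Comment(line=n, identifies=None) for n in range(1, pr_lines + 1)]

found = {c.identifies for c in comments if c.identifies in real_issues}
issue_recall = len(found) / len(real_issues)               # 0.0
line_coverage = len({c.line for c in comments}) / pr_lines  # 1.0
```

Every line is covered, yet recall measured against the real issue is 0%.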