
Comment by Xylakant

1 day ago

> 1 memory safety vulnerability, that's a pretty important distinction.

Yes, I believe that at least the order of magnitude is correct because 4 800 000 of those lines are guaranteed to not have any by virtue of the compiler enforcing memory safety.

So it's 1 per 200 000, which is 1-2 orders of magnitude worse, but still pretty darn good. Given that not all unsafe code actually has the potential for memory safety issues, and that the compiler will still enforce a pretty wide set of rules even inside unsafe blocks, I consider this to be achievable.
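As a back-of-the-envelope check (a sketch using the round figures implied above: ~5M total lines, 4.8M of them safe, so ~200K unsafe — these are the thread's numbers, not taken from the report itself):

```rust
fn main() {
    // Figures implied by the thread, not taken from the report itself.
    let total_loc: u64 = 5_000_000;
    let safe_loc: u64 = 4_800_000;
    let unsafe_loc = total_loc - safe_loc; // 200_000 lines of unsafe

    // One memory safety vulnerability, attributed entirely to unsafe code:
    println!("overall rate: 1 per {} LoC", total_loc); // 1 per 5M
    println!("unsafe rate:  1 per {} LoC", unsafe_loc); // 1 per 200K

    // 5M / 200K = 25x, i.e. between 1 and 2 orders of magnitude.
    assert_eq!(total_loc / unsafe_loc, 25);
}
```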

This is clearly a competent team that's writing important and challenging low-level software. They published the numbers voluntarily and are staking their reputation on these reports. From personal observation of the Rust projects we work on, the results track with the trend.

There's no reason for me to disbelieve the numbers put forward in the report.

1 per 5M or 1 per 200K is pretty much unbelievable, especially in such a complex codebase, so all I can say then is to each their own.

  • > especially in such a complex codebase

    You accidentally put the finger on the key point, emphasis mine.

    When you have a memory-unsafe language, the complexity of the whole codebase impacts your ability to uphold memory-related invariants.

    But unsafe blocks are, by definition, limited in scope, and assuming you design your codebase properly, they shouldn't interact with unsafe blocks in other modules. So the complexity related to one unsafe block is in fact contained to its own module and doesn't spread outside. And that makes everything much more tractable, since you never have to reason about the whole codebase, only about a limited scope every time.
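    A minimal sketch of that containment argument (hypothetical `Buffer` type, nothing from the report): the unsafe block sits behind a safe API, and the invariant it relies on is established and maintained entirely inside one module.

    ```rust
    mod buffer {
        // Invariant: `init <= data.len()`. The fields are private, so
        // only code in this module can establish or break the invariant.
        pub struct Buffer {
            data: Vec<u32>,
            init: usize,
        }

        impl Buffer {
            pub fn new(data: Vec<u32>) -> Self {
                let init = data.len();
                Buffer { data, init }
            }

            pub fn first(&self) -> Option<u32> {
                if self.init == 0 {
                    return None;
                }
                // SAFETY: init > 0 and init <= data.len() by construction,
                // so index 0 is in bounds. Nothing outside this module can
                // invalidate that reasoning.
                Some(unsafe { *self.data.get_unchecked(0) })
            }
        }
    }

    fn main() {
        let b = buffer::Buffer::new(vec![7, 8, 9]);
        assert_eq!(b.first(), Some(7));
    }
    ```

    Code outside `buffer` cannot touch `data` or `init`, so auditing this one module is enough to audit the unsafe block.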

    • No, this is just an example of confirmation bias. You're given a totally unrealistic figure of 1 vuln per 200K/5M LoC, and now you're hypothesizing about why that could be so. Google, for anyone unbiased, lost credibility when they put this figure into the report. I wonder what their incentive was for doing so.

      > But unsafe blocks are, by definition, limited in scope, and assuming you design your codebase properly, they shouldn't interact with unsafe blocks in other modules. So the complexity related to one unsafe block is in fact contained to its own module and doesn't spread outside. And that makes everything much more tractable, since you never have to reason about the whole codebase, only about a limited scope every time.

      Anyone who has written low-level code with substantial complexity knows that this is just wishful thinking. In such code, abstractions fall apart, and "the complexity related to one unsafe block is in fact contained to its own module and doesn't spread outside" is just wrong, as I explained in my other comment here: UB taking place in an unsafe section will propagate into the rest of the "safe" code. UB is not "caught" or put into quarantine by some imaginary safety net at the boundary between the safe and unsafe sections.
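      To illustrate the objection (hypothetical function, not from the report): the soundness of this unsafe block rests on a contract that only its safe callers can uphold, so a bug anywhere among those callers becomes UB here, with effects that are not confined to the block.

      ```rust
      /// Sums the first `len` elements.
      ///
      /// SAFETY contract: `len <= data.len()`. This is not checked; a safe
      /// caller that miscomputes `len` triggers UB inside the unsafe block,
      /// and the compiler may then miscompile surrounding "safe" code based
      /// on the assumption that the contract holds.
      fn sum_first(data: &[u32], len: usize) -> u32 {
          let mut total = 0;
          for i in 0..len {
              // SAFETY: relies entirely on the caller's promise above.
              total += unsafe { *data.get_unchecked(i) };
          }
          total
      }

      fn main() {
          // A correct caller is fine:
          assert_eq!(sum_first(&[1, 2, 3], 3), 6);
          // A caller passing len = 4 here would be UB, and the fallout
          // could surface anywhere in the program, not at this call site.
      }
      ```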


    • That's the other interesting observation you can draw from that report. The numbers in the first parts, about review times, rollback rates, etc., are broken down by change size, and the gap widens for larger changes. This indicates that Rust's language features support reasoning about complex changesets.

      It's not obvious to me which features are the relevant ones, but my general observation is that lifetimes, unsafe blocks, and the borrow checker allow people to reason about code in smaller chunks. For example, knowing that there's only one place where a variable may be mutated means that no other code location can change it at the same time.
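      A minimal sketch of that last point (assumed example, not from the report): while a mutable borrow is live, the borrow checker rejects every other access, so there is exactly one place to look when reasoning about what can change the value.

      ```rust
      fn main() {
          let mut count = 0;

          let handle = &mut count;  // the only path that may mutate `count`
          // count += 1;            // rejected: `count` is mutably borrowed
          // println!("{count}");   // rejected for the same reason
          *handle += 1;

          // `handle` is no longer used, so the borrow has ended.
          assert_eq!(count, 1);
      }
      ```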