Comment by moritzwarhier

3 hours ago

The whole premise of a "giant secure open source C codebase" seems questionable

Because code review is sometimes not much different from an idealized version of the halting problem, where you would have access to a formalized version of a specification.

In other words, there is no strict definition of what is a security issue.

On the other hand, it is (both halting and spec adherence) are checkable under compute and space constraints though? :) I'd say the biggest hurdle are means to describe the spec in way that is easy enough for a human to produce to make it feasible.

Not a DB person either, but things like TLA+ seem very hard to write even with LLMs. Behavioral tests with an enumerable number of random paths to take (aka model checking - eg jepsen) seem more feasible. Although you can't check internal properties of the system (string `pass` or any of it's copies or parts are not held anywhere in memory at any point between lines A and B) unless we can check that two memory dumps are indistinguishable with different pass strings (assuming we abstracted away storage devices in a test environment).. Also not sure if it's "easy enough" to write such tests either.

Maybe the reason is that OS domain objects / primitives are too complex and not "isolatable" enough / lack a clear contract at all? (Hence multi file refactorings that break invariants.)