← Back to context

Comment by dchftcs

6 hours ago

A missing specification in the proof of lean-zip, a lean component, is a real problem to the philosophy and practice of software verification.

To illustrate, let's say you want to verify a "Hello world" program. You'd think a verification involves checking that it outputs "Hello, world!".

However, if a contractor or AI hands you a binary, what do you need to verify? You will need to verify that it does exactly print "Hello, world!", no more, no less. It should write to stdout not stderr. It shouldn't somehow hold a lock on a system resource that it can't clean up. It cannot secretly install a root-kit. It cannot try to read your credentials and send it somewhere. So you will need to specify the proof to a sufficient level of detail to capture those potential deviations.

Broadly, with both bugs, you need to ask a question: does this bug actually invalidate my belief that the program is "good"? And here you are pulling up a fact that the bug isn't found in the Lean kernel, which makes an assumption that there's no side-effect that bleeds over the abstraction boundary between the runtime and the kernel that affects the correctness of the proof; that safety guarantee is probably true 99.99% of the time - but if the bug causes a memory corruption, you'd be much less confident in that guarantee.

If you're really serious about verifying an unknown program, you will really think hard "what is missing from my spec"? And the answer will depend on things that are fuzzier than the Lean proof.

Now, pragmatically, there many ways a proof of correctness adds a lot of value. If you have the source code of a program, and you control the compiler, you can check the source code doesn't have weird imports ("why do I need kernel networking headers in this dumb calculator program?"), so the scope of the proof will be smaller, and you can write a small specification to prove it and the proof will be pretty convincing.

All in all, this is a toy problem that tells you : you can't verify what you don't know you should verify, and what you need to verify depends on the prior distribution of what the program is that you need to verify, so that conditional on the proof you have, the probability of correctness is sufficiently close to 1. There's a lesson to learn here, even if we deem Lean is still a good thing to use.

> A missing specification in the proof of lean-zip, a lean component, is a real problem to the philosophy and practice of software verification.

Every time someone makes this point, I feel obliged to point out that all alternatives to software verification have this exact same problem, AND many, many more.