Comment by cubefox
3 days ago
One thing I didn't know until recently was that Lean doesn't solve the problem of verifying that a formal theorem actually states what we think it states rather than something else.
In some cases that's not an issue, because the formal statement of the theorem is very simple. E.g. for Fermat's Last Theorem, which can be formally expressed with "(x y z : ℕ+) (n : ℕ) (hn : n > 2) : x^n + y^n ≠ z^n". But in other cases it might be much harder to verify that a formal statement actually matches the intuitive informal statement we want.
Of course, Lean or similar languages still offer the huge potential of verifying that a formal statement is provable, even if they don't help with verifying that the formal statement matches the intended informal statement in natural language.
Are there, in general, ways to do that? Sorry if this sounds like a troll question, but I struggle to think of how one would verify that a formal statement matches the accepted understanding other than painstakingly going through the definitions with experts from both sides.
Yeah this seems like the specification/implementation problem. One can perfectly implement a bad spec, but coming up with the perfect spec is a human problem.
It's interesting that this rather fundamental problem doesn't seem to have an established name and Wikipedia article. At least I couldn't find one. Perhaps the problem is too obvious.
I don't think there is any technical solution for that, apart perhaps from asking an LLM, or multiple different LLMs, whether it thinks the formal definition matches the informal statement.
When mathematicians check normal (informal) proofs they usually verify both things at once: that the proof doesn't contain mistakes or insufficiencies, and that it proves what the author says it proves. Formal proof checkers aren't able to do the latter.
I personally view proof checkers as mathematicians' tools so I assume mathematicians would be involved either way. With some percentage actually preferring to work closely with these tools. See also Terence Tao's comment in https://terrytao.wordpress.com/2025/05/31/a-lean-companion-t... which feels relevant to me
2 replies →
Yeah, by adopting Lojban as our common language.
Joking of course, but only because Lojban doesn’t have formally specified semantics.
Computational semantics could, in principle, allow you to enumerate the possible logical meanings of a given informal English phrasing of the statement, and these can be either each proven or thrown out with explicit justification for why that was not what you meant.
Nope! Not a troll question. At the end of the day, we're human beings interacting with a bunch of electrified rocks. There will always be a layer of subjectivity and expertise involved in translating informal (or perhaps even unknown) desires into formal specs. It gets even worse when you are designing software that needs to take into account the underlying behavior of hardware; you wouldn't program an autopilot for a jumbo jet whose certification only exists in the context of some simplified formal model for how computers work, do you?
But I'm catastrophizing. If you think going through the definitions of a spec are painstaking and require expert insight to get right, it is miniscule compared to e.g. auditing every line of a codebase written in an imperative language. And not only that, to even get close to the same correctness guarantees of e.g. a formally verified pipeline through CompCert, you would have to audit every line of the codebases of every compiler that touches your code (https://r6.ca/blog/20200929T023701Z.html "It Is Never a Compiler Bug Until It Is"), together with analysis of the underlying hardware models.
All this to say, technology is more complicated than traditional programmers exposed only to high-level languages might be led to believe. That complication can only be reduced, consolidated, or blackboxed so much.
Yeah Lean is actually pretty interesting in that sense because it’s designed to have a small kernel that actually does the type theory checking, and that kernel has a specification, tests, and independent reimplementations. The kernel really is very small compared to the entirety of Lean syntax and behavior — everything else runs and is elaborated before stuff feeds into the kernel. So the surface area for actual proof checking bugs is greatly reduced.
1 reply →
>Lean doesn't solve the problem of verifying that a formal theorem actually states what we think it states rather than something else.
With all honesty, if a tool existed that did that we would have solved maths. Or at least we would have solved human communication. In both cases it would be the most studied thing in the world, so I don't know where this belief would come from