Comment by danabramov

3 days ago

Are there, in general, ways to do that? Sorry if this sounds like a troll question, but I struggle to think of how one would verify that a formal statement matches the accepted understanding other than painstakingly going through the definitions with experts from both sides.

10 comments

danabramov

tunesmith 3 days ago

Yeah this seems like the specification/implementation problem. One can perfectly implement a bad spec, but coming up with the perfect spec is a human problem.

cubefox 2 days ago

It's interesting that this rather fundamental problem doesn't seem to have an established name and Wikipedia article. At least I couldn't find one. Perhaps the problem is too obvious.

cubefox 3 days ago

I don't think there is any technical solution for that, apart perhaps from asking an LLM, or multiple different LLMs, whether it thinks the formal definition matches the informal statement.

When mathematicians check normal (informal) proofs they usually verify both things at once: that the proof doesn't contain mistakes or insufficiencies, and that it proves what the author says it proves. Formal proof checkers aren't able to do the latter.

danabramov 3 days ago
I personally view proof checkers as mathematicians' tools so I assume mathematicians would be involved either way. With some percentage actually preferring to work closely with these tools. See also Terence Tao's comment in https://terrytao.wordpress.com/2025/05/31/a-lean-companion-t... which feels relevant to me
- cubefox 3 days ago
  
  Tao misunderstands the question here: it was about reconciling a traditional (informal) proof and a formal proof which come to opposite conclusions, not about two different formal proofs.
  
  1 reply →

adastra22 3 days ago

Yeah, by adopting Lojban as our common language.

Joking of course, but only because Lojban doesn’t have formally specified semantics.

Computational semantics could, in principle, allow you to enumerate the possible logical meanings of a given informal English phrasing of the statement, and these can be either each proven or thrown out with explicit justification for why that was not what you meant.

Paracompact 3 days ago

Nope! Not a troll question. At the end of the day, we're human beings interacting with a bunch of electrified rocks. There will always be a layer of subjectivity and expertise involved in translating informal (or perhaps even unknown) desires into formal specs. It gets even worse when you are designing software that needs to take into account the underlying behavior of hardware; you wouldn't program an autopilot for a jumbo jet whose certification only exists in the context of some simplified formal model for how computers work, do you?

But I'm catastrophizing. If you think going through the definitions of a spec are painstaking and require expert insight to get right, it is miniscule compared to e.g. auditing every line of a codebase written in an imperative language. And not only that, to even get close to the same correctness guarantees of e.g. a formally verified pipeline through CompCert, you would have to audit every line of the codebases of every compiler that touches your code (https://r6.ca/blog/20200929T023701Z.html "It Is Never a Compiler Bug Until It Is"), together with analysis of the underlying hardware models.

All this to say, technology is more complicated than traditional programmers exposed only to high-level languages might be led to believe. That complication can only be reduced, consolidated, or blackboxed so much.

danabramov 3 days ago
Yeah Lean is actually pretty interesting in that sense because it’s designed to have a small kernel that actually does the type theory checking, and that kernel has a specification, tests, and independent reimplementations. The kernel really is very small compared to the entirety of Lean syntax and behavior — everything else runs and is elaborated before stuff feeds into the kernel. So the surface area for actual proof checking bugs is greatly reduced.
- Paracompact 3 days ago
  
  Yeah, the goal for formal methods boils down to only two things: reduce surface area necessary for auditing (e.g. high-level Lean theorem and definitions file), and consolidate that surface area into preexisting, pre-audited technologies (e.g. the Lean kernel).