Formal Methods in general are underrated in the industry. Pretty much no large companies except AWS (thank you Byron Cook!) use them at a large scale.
Edit: maybe there are large companies that use them behind the curtains, but AWS is the only place I know of that publicly acknowledges how much they appreciate and use formal methods. If you know of any others, please comment; I'd be curious to learn how they're using them!
Microsoft, Amazon, Oracle, Google, all sorts of large companies use them, and talk about it, publicly. They've all published whitepapers and resources about them. Microsoft even employs Dr. Leslie Lamport, who created and maintains TLA+ (among other things).
Just for some quick examples:
Microsoft: https://github.com/Azure/azure-cosmos-tla, https://www.youtube.com/watch?v=kYX6UrY_ooA
Google: https://research.google/pubs/specifying-bgp-using-tla/, https://www.researchgate.net/publication/267120559_Formal_Mo...
Oracle: https://blogs.oracle.com/cloud-infrastructure/post/sleeping-... (note the author is a "Formal Verification Engineer", it's literally his job at Oracle to do this stuff)
Intel: https://dl.acm.org/doi/10.1145/1391469.1391675, https://link.springer.com/chapter/10.1007/978-3-540-69850-0_...
JetBrains: https://lp.jetbrains.com/research/hott-and-dependent-types/
Arm: https://ieeexplore.ieee.org/document/9974354
Lamport+Microsoft was the first thing that I thought of when I read the comment. FWIW he retired at the beginning of this year.
I don't think they are underrated. They are heavily used where they work really well and bugs have a very high cost (e.g. hardware design).
For the vast majority of software though they don't really make much sense because formally verifying the software is 10-100x more work than writing it and testing it with normal methods. And formal verification of software generally requires faaaaar more expertise than most people have. (The situation is much better for hardware design because it tends to be way simpler.)
It's a very powerful tool but also extremely difficult to use.
> Formal Methods in general are underrated in the industry. Pretty much no large companies except AWS (thank you Byron Cook!) use them at a large scale.
At least Microsoft and Google poured a lot of money into this by funding well-staffed, multi-year research projects. There's a substantial public trail of research papers. It's just that not a whole lot came out of it otherwise.
The problem isn't that the methods are underrated, it's that they aren't compatible with the approach to software engineering in these places (huge monolithic codebases, a variety of evolving languages and frameworks, no rigid constraints on design principles).
Can you ELI5 what formal methods are and why they're apparently not the industry standard? As a complete noob, from what I'm reading online, they're pretty much how you should approach software engineering, or really any sort of programming, right?
> Pretty much no large companies except AWS (thank you Byron Cook!) use them at a large scale.
I don't think that's true at all. I suppose that depends on what you mean by formal methods and in what context you're concerned about those. Off the top of my head this comes to mind from Microsoft: https://learn.microsoft.com/en-us/windows-hardware/drivers/d...
SAT solvers and the algorithms surrounding them are so much fun. I agree they are very unappreciated.
Shameless plug: I wrote an (admittedly very derivative) introduction with some examples I thought were cool at the time.
https://www.nhatcher.com/post/on-hats-and-sats/
Thanks, that was quite informative; perfect for me.
SMT is so much fun. The Z3 Python API lets you write your problem very directly and then gives you fast answers, even for quite large problems.
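For a taste of what that directness looks like, here's a tiny sketch with a toy constraint of my own (assuming the z3-solver package; nothing from the post itself):

    from z3 import Ints, Solver, sat

    x, y = Ints("x y")
    s = Solver()
    # Constraints read almost like the problem statement.
    s.add(x + y == 10, x > y, x > 0, y > 0)

    if s.check() == sat:
        print(s.model())  # prints one satisfying assignment, e.g. [y = 3, x = 7]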
Note that CVC5 has basically the same Python API ( https://cvc5.github.io/docs/cvc5-1.1.2/api/python/python.htm... ) and is often much faster.
I did write a shockingly similar solution a few months ago:
https://gist.github.com/enigma/98ea0392471fa70211251daa16ce8...
This post is basically the programming joke about Python: "import solution; solution()".
Barely a joke, this is literally what using the Python Z3 bindings feels like.
I tried to write a programming language that compiles to SAT many years ago: https://sentient-lang.org/
I love that language and frequently show it to people. I'm sad to see that my local install doesn't work any more. I actually used it to solve a puzzle in Evoland 2 that I'm relatively sure was added as a joke, and is not solvable in a reasonable time without a solver. I'm actually doing a PhD in compilers right now, and would love to chat about Sentient if you have the time. My email is sasha@lopoukhine.com.
I remember you showing me this! Wow that was a long time ago.
Haha, Marijn Heule, who is pushing the limits of SAT solving, would love this. If you manage to get him excited, he might spend a few years on this problem :) He's kinda famous for solving the Boolean Pythagorean Triples problem using SAT [1]. He loves puzzles. He also got Knuth excited about a bunch of fun puzzles.
BTW, these puzzles also tend to have a lot of symmetries, which SAT solvers are pretty bad at handling. You can break them, though, using a variety of techniques, e.g. static symmetry breaking [2], or symmetric learning.
[1] https://www.cs.utexas.edu/~marijn/ptn/ [2] https://github.com/markusa4/satsuma
How good are current LLMs at translating problems given as text into something SMT solvers can operate on? Be it MiniZinc, Z3, SMT-LIB, Python bindings, etc. Anyone tried it out?
Apparently they're getting very good: https://emschwartz.me/new-life-hack-using-llms-to-generate-c...
I try not to use them too much because I want to build the skill of using SMT solvers directly for now.
I've found them to be bad, for the most part. There aren't enough blog posts and examples of code out there for them to leech from.
Besides which, I would argue the process of writing the model in the language is integral to building the understanding you need to deal with the results. You'll spot bugs as you're creating the model.
I tried it many months ago and it was garbage. But that was with SMT-LIB directly. Maybe it works better via the Python bindings?
Reminds me of a small project I did back in undergrad: Minesweeper using a SMT solver. https://github.com/stong/smt-minesweeper
If you want a language for expressing constraint satisfaction problems that's higher-level than SAT, I think MiniZinc is pretty interesting. https://www.minizinc.org/
I actually wrote a backtracking solution to the LinkedIn queens game a while ago (and the tango game).
I know nothing about SMT or SAT and I imagine they might be faster, but the backtracking appears to solve just as instantaneously when you push the button.
Might be cool to learn a bit about SMT or SAT. Not sure how broadly applicable they are but I've seen people use them to solve some difficult advent of code problems if nothing else.
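For anyone curious what the backtracking version roughly looks like, here's a toy sketch (not the actual code; it assumes the board arrives as an n x n grid of region ids):

    def solve_queens(regions):
        """One queen per row, column, and color region; no two queens
        may touch diagonally. regions[r][c] is the region id of a cell."""
        n = len(regions)
        cols = [None] * n  # cols[r] = column chosen for row r

        def ok(r, c):
            for r2 in range(r):
                c2 = cols[r2]
                if c2 == c or regions[r2][c2] == regions[r][c]:
                    return False
                if r2 == r - 1 and abs(c2 - c) == 1:  # diagonal adjacency
                    return False
            return True

        def place(r):
            if r == n:
                return True
            for c in range(n):
                if ok(r, c):
                    cols[r] = c
                    if place(r + 1):
                        return True
                    cols[r] = None
            return False

        return cols if place(0) else None

On boards this small it finishes effectively instantly, which matches the "just as instantaneous" experience.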
Definite broad applicability.
NP-complete problems are the hardest problems in NP. Cook proved in 1971 that SAT is NP-complete: for any other problem in NP, we can quickly (i.e., in polynomial time) convert instances of that problem into instances of SAT. In other words, we can use SAT to solve any problem in NP.
It turns out there are many NP-complete problems. The fast conversion applies among them too, so in some sense NP-complete problems are all the same, because we can use any of them to solve instances of the others. However, for these problems the best known algorithms are, in the worst case, little better than trying all possible inputs, which requires exponential time (very, very slow for even modestly large inputs).
Lots of research has been and continues to be poured into SAT because any gains automatically yield improvements to everything else in NP-complete and the rest of NP. Using a SAT solver allows you to hitch a ride more or less for free on the results of all that research. Each incremental improvement to SAT solvers benefits programs that use them.
As the author noted, forming SAT instances by hand can be a pain. SMT, or Satisfiability Modulo Theories, is sort of a high-level language that “compiles down” to SAT. Expressing problems with SMT is more natural, and it reduces the burden of converting your problem into solver input and the solver's solutions back into your problem domain.
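To make the contrast concrete, here's a rough sketch of "exactly one of three options" written both ways: as hand-built CNF clauses fed to a SAT solver, and as a direct statement in Z3 (assuming the python-sat and z3-solver packages; just an illustration, not from the article):

    from pysat.solvers import Glucose3        # raw SAT: variables are bare integers
    from z3 import Bools, Solver, Sum, If, sat

    # SAT: "exactly one of three options" hand-encoded as CNF clauses.
    g = Glucose3()
    g.add_clause([1, 2, 3])                   # at least one is true
    for a, b in [(1, 2), (1, 3), (2, 3)]:
        g.add_clause([-a, -b])                # no two are true at once
    print(g.solve(), g.get_model())

    # SMT: the same constraint stated directly over named booleans.
    x1, x2, x3 = Bools("x1 x2 x3")
    s = Solver()
    s.add(Sum([If(b, 1, 0) for b in (x1, x2, x3)]) == 1)
    if s.check() == sat:
        print(s.model())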
Hah, about a month ago I wrote a DLX solver for exact cover problems and LiQueens was one of my first implementations.
Next I want to try to solve the Tango and Zip games.
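For anyone who hasn't seen it, the core of Knuth's Algorithm X is surprisingly small without the dancing-links machinery. A naive set-based sketch (not real DLX, and not the parent's code): X maps each constraint to the set of rows covering it, Y maps each row to the constraints it covers.

    def exact_cover(X, Y, solution=None):
        # X: {constraint: set of rows covering it}, Y: {row: list of constraints it covers}
        if solution is None:
            solution = []
        if not X:
            yield list(solution)
            return
        c = min(X, key=lambda k: len(X[k]))   # branch on the most constrained column
        for r in list(X[c]):
            solution.append(r)
            cols = select(X, Y, r)
            yield from exact_cover(X, Y, solution)
            deselect(X, Y, r, cols)
            solution.pop()

    def select(X, Y, r):
        cols = []
        for j in Y[r]:
            for i in X[j]:
                for k in Y[i]:
                    if k != j:
                        X[k].remove(i)
            cols.append(X.pop(j))
        return cols

    def deselect(X, Y, r, cols):
        for j in reversed(Y[r]):
            X[j] = cols.pop()
            for i in X[j]:
                for k in Y[i]:
                    if k != j:
                        X[k].add(i)

Dancing links replaces the set bookkeeping with doubly linked lists so rows can be removed and restored cheaply.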
You mentioned SMT is slower than SAT and left it there, but that feels incomplete. In problems like this, solve time barely matters unless you're generating at scale. The real weight is in how fast you can write, refactor, and trust the constraints. SAT might give faster outputs, but SMT usually gets you to a correct model quicker, with less friction. I'm wondering if you actually benchmarked both and stuck with SAT on the numbers, or if it was more of a default comfort pick. It felt like a missed moment to shift the lens from solver speed to model dev loop speed.
I was briefly looking into using SMT for Minecraft autocrafting, but it turns out you can do integer linear programming and the mapping is easier.
Some additional context: Outside of Microsoft, this puzzle is often known as Star Battle.
Terrific little puzzle, highly recommend it!
https://www.puzzles.wiki/wiki/Star_Battle
https://www.puzzle-star-battle.com/?size=5
What about a CP solver?
> Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. Glucose is really, really fast.
I'm new to this area; neither the original article nor the link to Glucose has enough info to tell me the order of magnitude here: milliseconds? Hours?
The article never even says what SMT stands for, let alone describes or explains it. This article should help:
https://en.wikipedia.org/wiki/Satisfiability_modulo_theories
In the article:
> "Satisfiability Modulo Theories"
Huh, where? I didn't see it when going through the article, and opening it in reader mode and ctrl+f-ing any of those words still turns up nothing for me.