Comment by 9rx
2 days ago
> Python type hints allow checking correctness statically
Not really. You can do some basic checking, like ensuring you don't pass a string where an integer is expected, but the tests you need anyway to make sure you're handling those integers properly (Python type hints aren't nearly capable enough to forgo them) would catch that too. The LLM doesn't care whether the error comes from a type checker or a test suite.
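To make that concrete, here is a minimal sketch (the function names are hypothetical, not from any real codebase). Both functions carry identical, correct annotations, so a checker such as mypy or pyright would accept both; only a behavioural test tells them apart.

```python
def average(total: int, count: int) -> float:
    # Hypothetical helper with a logic bug: the operands are swapped.
    # The annotations are still satisfied, because int / int is a float.
    return count / total


def average_fixed(total: int, count: int) -> float:
    # Same signature, correct logic.
    return total / count


# A type checker would reject this call before the code ever runs:
# average("10", 2)

# The swapped operands are invisible to the checker and show up only in a test.
assert average_fixed(10, 4) == 2.5
assert average(10, 4) != 2.5
```

The commented-out call is the kind of mistake the hints do catch. The swapped division is the kind they don't.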
When you get into real statically typed languages, few of them give much consideration to Python as a target. Perhaps you can prompt an LLM to build you an extractor, but otherwise, based on what already exists, your best bet is likely Lean extracted to C, imported as a Python module. Easier would be to cut Python out of the picture, though.
If you are satisfied with the SMT middle-ground, Dafny does support Python as a target. But as the earlier commenter said: Types are best.
I think you're underestimating the current state of the Python type system. With Python 3.12 and pyright or mypy in strict mode, it's very reliable, and it makes those kinds of tests unnecessary. This requires you to fully buy into the idea, though, with 100% of your codebase statically typed and using only typed libraries, unless you're comfortable writing wrappers.
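As a rough illustration of the kind of test this can stand in for, assuming a strict checker is run over the whole codebase (the names `UserId`, `OrderId`, and `load_user_name` are made up for the example):

```python
from typing import NewType

# Two distinct identifier types that are both plain ints at runtime.
UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)


def load_user_name(user_id: UserId) -> str:
    # Hypothetical lookup; a real one would hit a database.
    return f"user-{user_id}"


uid = UserId(7)
oid = OrderId(7)

print(load_user_name(uid))

# Both of these run fine at runtime, but mypy and pyright reject them:
# load_user_name(oid)  # OrderId is not UserId
# load_user_name(7)    # a bare int is not UserId
```

Without the checker, you'd want a test guarding against mixed-up IDs. With it, that particular mix-up is flagged statically. Whether that covers "those kinds of tests" in general is the point under dispute in this thread.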
It's not Rust-level, but I'd argue it's better than C or Go's type systems.
The comparison with Rust and not something like Lean, Rocq, or Idris is telling. Rust's type system is not much better than Python's, still requiring tests for everything.
These partial type systems cannot replace any actually useful tests. I'll grant you that testing is the least understood aspect of computer science, leading to a lot of really poorly conceived tests out in the wild. I can buy that those bad, useless tests can be replaced, although they weren't actually needed in the first place.