Comment by maleldil

2 days ago

But that's not the argument here. Python type hints allow checking correctness statically, which is what matters for agents.

> Python type hints allow checking correctness statically

Not really. You can do some basic checking, like ensuring you don't pass a string into where an integer is expected, but your tests required to make sure that you're properly dealing with those integers (Python type hints aren't nearly capable enough to forgo that) would catch that anyway. The LLM doesn't care if the error comes from a type checker or test suite.

When you get into real statically typed languages there isn't much consideration for Python. Perhaps you can prompt an LLM to build you an extractor, but otherwise, based on what already exists, your best bet is likely Lean extracted to C, imported as a Python module. Easier would be to cut Python out of the picture, though.

If you are satisfied with the SMT middle-ground, Dafny does support Python as a target. But as the earlier commenter said: Types are best.

  • I think you're underestimating the current state of the Python type system. With Python 3.12 and pyright or mypy in strict mode, it's very reliable, and it makes those kinds of tests unnecessary. This requires you to fully buy into the idea, though, with 100% of your codebase statically typed and using only typed libraries, unless you're comfortable writing wrappers.

    It's not Rust-level, but I'd argue it's better than C or Go's type systems.

    • The comparison with Rust and not something like Lean, Rocq, or Idris is telling. Rust's type system is not much better than Python's, still requiring tests for everything.

      These partial type systems cannot replace any actually useful tests. I'll grant you that testing is the least understood aspect of computer science, leading to a lot of really poorly conceived tests out in the wild. I can buy that those bad, useless tests can be replaced — albeit weren't actually needed in the first place.

Yes then you might as well use some other language that uses types but also gets you performance. I agree the ecosystem is missing but hey we have LLMs now

  • I don't understand why you keep bringing up performance. If you're considering using Python, as many projects are, performance is obviously not a concern.

    Python is a good language. Its ecosystem is rich, and I find it very productive. I want to use it, but I also want as much static analysis as possible, so I use ruff and pyright.

  • Performance isn’t the only important metric. There are other pros to weigh. For many apps a language might be performant enough, and bring other pros that make it more appealing than more performant alternatives.