← Back to context

Comment by dmurray

13 hours ago

Type checking in Python involves guessing. Take a program like

x = [3]

Should the type checker guess that the type of x is list[Any], or list[int], or list[Literal[3]], or something else? Libraries you depend on will pose more difficult versions of this question.

So it's not possible to expect the tool to be "perfect", whatever that means - usually people think of it as meaning it allows all code that they think is idiomatic and reasonable, and disallows all code that could raise a TypeError at runtime. You can only expect the tool to be reasonably good and perhaps to have an opinionated design that matches the way you would like to write Python.

Arguably it's the job of a linter and not a type checker, but if your code just assigns random things to random variables that never get used again, I hope something would point that out to you before it passes code review and merged to main, ideally before it even gets sent for review.

Ignoring the question of why there's some random assignment to x in the first place, where are its type hints? Those were added starting with Python 3.5 via PEP 484 just over 10 years ago and have been added to since then. If our goal is maintainable code at scale, the first thing I'd expect of a type checker is for it to communicate with the user, stating that a) the variable in file foo.py on line 43 is missing a type hint and couldn't be guessed with confidence, and that it's just gonna guess it's a list[int]. For bonus points, the tool could run in interactive mode and ask the user directly. But even without a hint, assuming the variable actually is used later, how does it get used?

It seems like you'd just start with the most strict interpretation, list[Literal[3]], and relax it as it gets used. If it gets fed into a function that doesn't have hints and can't be introspected for whatever reason, relax it to list[int] and then further relax it to list[Any] as necessary. Then depending on the mode the user configured the tool to run in, print nothing, or a warning, or error out or ask the user what to do if running interactively. Some of the config options could be strict, tolerant, and loose. Or maybe enforcing, permissive, and loose. Whatever color we want this bike shed to be.

More advanced tools let the user configure individual issues and their corresponding levels, but that may be considered too many options to give the user, overwhelming the user who then doesn't use the tool.

As far as perfect typing goes, I mean, yeah, ideally, after satisfying the type checker in strict mode, which means no types need to be guessed at, as all variables and other locations without type annotations that need them were reported, or even annotated interactively, or if we're real fancy, automatically with a comment. IMO programs should feel free to edit code directly (as a non-default option) That would mean the program could not throw an an uncaught TypeError, unexpectedly. That's a lot to ask of a dynamically strongly typed language, but I'd settle for it never being utterly wrong.

What that means is if the type checker says the function is returning a string, but of the 200 calls to that function, the type checker throws an error and says ten of those callsites expect it to return a dict, all I'm saying is this hypothetical type checker had better be right about that. Nothing worse than losing hours rooting around in the code looking for something that isn't actually there (or an errant semicolon, but this isn't C)

  • > Ignoring the question of why there's some random assignment to x in the first place, where are its type hints? Those were added starting with Python 3.5 via PEP 484 just over 10 years ago and have been added to since then.

    The code is in a third party library which doesn't have comprehensive type hints, and perhaps has type expectations so complicated that they can't be expressed in the Python type system.

    Even if you enforce type hints on every line in your internal code, you're going to be relying at some point on libraries which are reliable but poorly type hinted. If you're not using that vast ecosystem of libraries, Python probably wasn't the best choice.