[ty developer here]

We are happy with the attention that ty is starting to receive, but it's important to call out that both ty and pyrefly are still incomplete! (OP mentions this, but it's worth emphasizing again here.)

There are definitely examples cropping up that hit features that are not yet implemented. So when you encounter something where you think what we're doing is daft, please recognize that we might have just not gotten around to that yet. Python is a big language!

Really loving those markdown-style tests. I think it's a really fantastic idea that allows the tests to easily act as documentation too. Can you explain how you came up with this solution? Rust docs code-examples inspired?

That concept has been formalized as part of the Python standard library: https://docs.python.org/3/library/doctest.html

Elixir has this: https://hexdocs.pm/elixir/main/docs-tests-and-with.html

I use this in my books to show the output, but also to "test" that the code found in my books actually works.

I love doctest as it works so well with a REPL, but unfortunately it hasn't really gained traction anywhere I've seen.

Surfacing revealed types as `@TODO` made me laugh, but thinking about it, it's actually a pretty neat touch!

It really helps in our mdtests, because then we can assert that not-implemented things are currently wrong, but for the right reasons!
Totally orthogonal question, but since you're deep in that side of Rust dev -
The subject of a "scripting language for Rust" has come up a few times [1]. A language that fits nicely with the syntax of Rust, can compile right alongside rust, can natively import Rust types, but can compile/run/hot reload quickly.
Do you know of anyone in your network working on that?
And modulo the syntax piece, do you think Python could ever fill that gap?

[1] https://news.ycombinator.com/item?id=44050222
I don't know that I'd want the scripting language to be compiled, for reasons that are outside the scope of this reply. So removing that constraint, the coolest thing I've seen in this space recently is kyren's Piccolo:

https://kyju.org/blog/piccolo-a-stackless-lua-interpreter/
> And modulo the syntax piece, do you think Python could ever fill that gap?
I would never, ever want a full-fledged programming language for building type checking plugins, doubly so in cases where one expects the tool to run in a read-write context.
I am not saying that Skylark is the solution, but its sandboxed mental model aligns with what I'd want from such a solution.
I get the impression the WASM-adjacent libraries could also help here, since the WASI boundary already limits what mutations are allowed.

There's Gluon, which doesn't share Rust's syntax but does have a Hindley-Milner-based type system and embeds pretty seamlessly in a Rust program: https://github.com/gluon-lang/gluon
Most of the time, you want the types to be dynamic in a scripting language, as you don't want to expose the types to the user. With this in mind, rhai and rune are pretty good.

On the Python front, there was also the PyOxidizer thing, but it seems dead.
I am very interested in both of these. Coming from the TypeScript world I'm really interested in the different directions (type inference or not, intersections and type narrowing...). As a Python developer I'm wearily resigned to there being 4+ python type checkers out there, all of which behave differently. How very python...
Following these projects with great interest though. At the end of the day, a good type checker should let us write code faster and more reliably, which I feel isn't yet the case with the current state of the art of type checking for Python.

Good luck with the project!

I am not well versed in Python programming; this is just my opinion as an outsider. For anyone interested in using these tools, I suggest reading the following: https://www.reddit.com/r/Python/comments/10zdidm/why_type_hi...
That post should probably be taken lightly, but I think the goal there is to show that even with the best typing tools you will have trouble unless you start by establishing good practices.
For example, Django is a large code base, and if you look at it, you will observe that the code is consistent in which features of Python are used and how; this project passes the stricter type checking test without trouble. Likewise, Meta certainly has a very large code base (why develop a type checker otherwise?), and they must have figured out that they cannot let their programmers write code however they like; I guess their type checker is the stricter one for that reason.
Python, AFAIK, has many features, a very permissive runtime, and perhaps (not unlike C++) only some limited subset should be used at any time to ensure that the code is manageable. Unfortunately, that subset is probably different depending on who you ask, and what you aim to do.
(Interestingly, the Reddit post somehow reminded me of the hurdles Rust people have getting the Linux kernel guys to accept their practices: C has a much simpler and more carefree type system, and Rust being much more strict rubs those C guys the wrong way.)
The top comment in that post shuts down the whole nonsense pretty quickly and firmly:
> If you have a super-generic function like that and type hinting enforced, you just use Any and don't care about it.
It's a stupid example, but even within the context of a `slow_add` function in a library: maybe the author originally never even thought people would pass in non-numeric values, so in the next version update instead of a hardcoded `time.sleep(0.1)` they decide to `time.sleep(a / b)`. Oops, now it crashes for users who passed in strings or tuples! If only there were a way to declare that the function is only intended to work with numeric values, instead of forcing yourself to provide backwards compatibility for users who used that function in unexpected ways that happened to work.
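To make that concrete, here is roughly what such a version bump looks like (hypothetical code, extrapolating the post's `slow_add` example):

    import time

    # v1: accidentally works for anything supporting "+" (strings, tuples, ...)
    def slow_add(a, b):
        time.sleep(0.1)
        return a + b

    # v2: an innocuous-looking change that silently assumes numbers
    def slow_add(a, b):
        time.sleep(a / b)  # TypeError for slow_add("foo", "bar")
        return a + b

A hint like `def slow_add(a: float, b: float) -> float` would have told the string-passing users from day one that they were on unsupported ground.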
IMO: for Python meant to run non-interactively with any sort of uptime guarantees, type checking is a no-brainer. You're actively making a mistake if you choose to not add type checking.
As the author of that post, I'd like to point out the example was meant to be stupid.
The purpose was to show that different ideologies and expectations applied to the same code don't work together - strict backwards compatibility, duck typing, and strictly following linting or type hinting rules (due to some arbitrary enforcement). Although re-reading it now, I wish I'd spent more than an evening working on it; it's full of issues and not very polished.
> If you have a super-generic function like that and type hinting enforced, you just use Any and don't care about it.
Following the general stupidness of the post: they are now unable to do that, because a security consultant said they have to enable, and cannot break, Ruff rule ANN401: https://docs.astral.sh/ruff/rules/any-type/
One thing that post does do though is very clearly highlight the difference between Python's type system and say ... TypeScript's.
TypeScript's goal is to take a language with an unhinged duck type system that allows people to do terrible things and then allow you to codify and lock in all of those behaviours exactly as they're used.
Mypy's goal (and - since it was written by GvR and codified in the stdlib - by extension that of Python and all other typecheckers) is to take a language with an unhinged duck type system that allows people to do terrible things, and then pretend that isn't the case and enforce strict academic rules and behaviours that don't particularly care about how real people write code and interact with libraries.
If you include type hints from the very beginning, then you are forced to use the very limited subset of behaviours that mypy allows you to codify, and everything will be "fine".
If you try to add type hints to a mature project, you will scream with frustration as you discover how many parts of the codebase literally cannot be represented in the extremely limited type system.
At this point I'm fairly convinced that the effort one would spend trying to typecheck a python program is better spent migrating away from python into a language that has a proper type system, then using interop so you can still have the bits/people that need python be in python.
Obviously that isn't always possible but you can spend far too long trying to make python work.
I think you're forgetting how easy type annotation is.
I occasionally spend like 2h working on some old python code. I will spend say 15 minutes of that time adding type annotations (sometimes requires some trivial refactoring). This has an enormous ROI, the cost is so low and the benefit is so immediate.
In these cases migrating code to a proper language and figuring out interop is not on my radar, it would be insane. So having the option to get some best-effort type safety is absolutely fantastic.
I can definitely see your point, it's a useful analysis for projects under heavy development. But if you have a big Python codebase that basically just works and only sees incremental changes, adding type annotations is a great strategy.
If you're supposedly good at software and you spent too long trying to make Python work, consider the possibility that you're not good at software?
Python has flaws and big ones at that, but there's a reason it's popular. Especially with tools like pydantic and fastapi and uv (and streamlit) you can do insane things in hours what would take weeks and months before. Not to mention how good AI is at generating code in these frameworks. I especially like typing using pydantic, any method is now able to dump and load data from files and dbs and you get extremely terse validated code. Modern IDEs also make quick work of extracting value even from partially typed code. I'd suggest you just open your mind up to imperfect things and give them a shot.
Six months into learning to build a modern Python app - with linters, type systems, tests, venvs, package managers, etc. - I realized that the supposed difficulty of Rust is drastically less than coming up to speed and then keeping up with the Python “at scale” ecosystem.
I don't understand this point at all. I've worked on Django codebases which have a huge set of typing problems... and while it's not 100% I get a lot of value out of type checking.
You annotate enough functions and you get a really good linter out of it!

Unfortunately, with us being in the middle of the AI hype cycle, everyone and their dog is currently busy migrating to Python.
If you do that, you need to compile, which means you can't just distribute a text file with your Python program. You need build infrastructure for every Python version, every architecture, and every OS.

Have fun with that!
> Python, AFAIK, has many features, a very permissive runtime, and perhaps (not unlike C++) only some limited subset should be used at any time to ensure that the code is manageable. Unfortunately, that subset is probably different depending on who you ask, and what you aim to do.
I'll get started on the subset of Python that I personally do not wish to use in my own codebase: metaclasses, descriptors, callable objects using __call__, object.__new__(cls), names that trigger the name mangling rules, self.__dict__. In my opinion, all of the above features involve too much magic and hinder code comprehension.

There's a time and a place for each of them:
* Meta classes: You're writing Pydantic or an ORM.
* Descriptors: You're writing Pydantic or an ORM.
* Callable objects: I've used these for things like making validators you initialize with their parameters in one place, then pass them around so other functions can call them. I'd probably just use closures if at all possible now. (See the sketch after this list.)
* object.__new__: You're writing Pydantic or an ORM.
* Name mangling: I'm fine with using _foo and __bar where appropriate. Those are nice. Don't ever, ever try to de-mangle them or I'll throw a stick at you.
* self.__dict__: You're writing Pydantic or an ORM, although if you use this as shorthand for "doing things that need introspection", that's a useful skill and not deep wizardry.
Basically, you won't need those things 99.99% of the time. If you think you do, you probably don't. If you're absolutely certain you do, you might. It's still good and important to understand what they are, though. Even if you never write them yourself, at some point you're going to want to figure out why some dependency isn't working the way you expected, and you'll need to read and know what it's doing.
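To illustrate the callable-objects point above, here's the pattern in both styles (a hypothetical validator, not from any real codebase):

    # Callable object: configured once, then passed around like a function
    class RangeValidator:
        def __init__(self, lo: float, hi: float) -> None:
            self.lo, self.hi = lo, hi

        def __call__(self, value: float) -> bool:
            return self.lo <= value <= self.hi

    # The closure version that usually suffices instead
    def range_validator(lo: float, hi: float):
        def check(value: float) -> bool:
            return lo <= value <= hi
        return check

    is_valid_age = RangeValidator(0, 130)
    assert is_valid_age(42)
    assert range_validator(0, 130)(42)

The class buys you introspectable state and room for extra methods later; the closure is less magic, which is the point being made.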
You should try Go!

Can you share a little bit about what makes you form opinions when you are not even using the language? I think it's fascinating how discussions about typing in particular make people shake their fists at a language they don't even use - and, like your post, make up some contrived example.
>I think that the goal there is to understand that even with the best typing tools, you will have troubles, unless you start by establishing good practices.
Like - what makes you think that Python developers don't understand stuff about Python, when they are actively using the language, as opposed to you?
Indeed, I'm not a regular Python practitioner. I had to use it from time to time because it's the language chosen by the tools I happened to use at that time, like Blender, or Django. In the former case, it wasn't very enjoyable (which says a lot about my skills in that area, or rather lack thereof), while in the latter case I found it quite likeable. So that's my background as far as python goes.
I must admit that I largely prefer static typing, which is why I got interested in that article. It's true that trying to shoehorn this feature into the Python ecosystem is an uphill battle: there's a lot of good engineering skill being spent on this.
Perhaps there's a connection to make between this situation and an old theorem about incompleteness?

https://copilot.microsoft.com/shares/2LpT2HFBa3m6jYxUhk9fW

(was generated in quick mode, so you might want to double check).

As someone who has been writing Python for years, the worst mistake I have ever seen people make is not adding type hints and not using a type checker.
Also not creating custom, expressive Pydantic types and using nested dicts in places. Nested dicts suck, you never know what you're getting, and it's well worth the time converting them to classes.
To try to tl;dr that rather long post:

> When you add type hints to your library's arguments, you're going to be bitten by Hyrum's Law and you are not prepared to accurately type your full universe of users
That's understandable. But they're making breaking changes, and those are just breaking change pains - it's almost exactly the same if they had instead done this:
def slow_add(a, b):
    if not isinstance(a, int):
        raise TypeError
    ...
but anyone looking at that would say "well yeah, that's a breaking change, of course people are going to complain".
The only real difference here is that it's a developer-breaking change, not a runtime-breaking one, because Python does not enforce type hints at runtime. Existing code will run, but existing tools looking at the code will fail. That offers an easier workaround (just ignore it), but is otherwise just as interruptive to developers because the same code needs to change in the same ways.
---
In contrast! Libraries can very frequently add types to their return values and it's immediately useful to their users. You're restricting your output to only the values that you already output - essentially by definition, only incorrect code will fail when you do this.
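A tiny illustration of that asymmetry (hypothetical function, not from any real library):

    # Before: callers can only guess what comes back
    def get_timeout():
        return 30.0

    # After: the annotation only documents what was already true
    def get_timeout() -> float:
        return 30.0

Any caller that breaks on the annotated version was already treating the value as something it never was.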
> ty, on the other hand, follows a different mantra: the gradual guarantee. The principal idea is that in a well-typed program, removing a type annotation should not cause a type error. In other words: you shouldn’t need to add new types to working code to resolve type errors.
The gradual guarantee that Ty offers is intriguing. I’m considering giving it a try based on that.
With a language like Python with existing dynamic codebases, it seems like the right way to do gradual typing.
Gradual typing means that an implicit "any" (unknown type) anywhere in your code base is not an error or even a warning. Even in critical code you thought was fully typed, where you mistakenly introduce a type bug and, due to some syntax or inference limit, the type checker unexpectedly loses the plot and tells you confidently: "no problems in this file!"
I get where they're coming from, but the endgame was a huge issue when I tried mypy - there was no way to actually guarantee that you were getting any protection from types. A way to assert "no graduality in this file, it's fully typed!" is critical. But gradual typing is not just about migrating; it's also about the crazy things you can do in dynamic languages, and about being terrified of false positives scaring away the people who didn't value static typing in the first place. Maybe calling it "soft" typing would be clearer.
I think gradual typing is an anti-pattern at this point.
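A minimal sketch of the failure mode described above (hypothetical code; exact behaviour depends on the checker and its settings):

    def get_user_id(payload):  # unannotated -> parameters are implicitly Any/Unknown
        return payload["id"]

    def handle(request: dict[str, str]) -> int:
        uid = get_user_id(request)  # inferred as Any, so nothing downstream is checked
        return uid + 1  # crashes at runtime: "id" maps to a str here

Under the gradual guarantee this file can come back clean, because the implicit Any swallows the mismatch.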
> Gradual typing means that an implicit "any" (unknown type) anywhere in your code base is not an error or even a warning. Even in critical code you thought was fully typed, where you mistakenly introduce a type bug and, due to some syntax or inference limit, the type checker unexpectedly loses the plot and tells you confidently: "no problems in this file!"
This is a good point, and one that we are taking into account when developing ty.
The benefit of the gradual guarantee is that it makes the onboarding process less fraught when you want to start (gradually) adding types to an untyped codebase. No one wants a wall of false positive errors when you first start invoking your type checker.
The downside is exactly what you point out. For this, we want to leverage that ty is part of a suite of tools that we're developing. One goal in developing ty is to create the infrastructure that would let ruff support multi-file and type-aware linter rules. That's a bit hand-wavy atm, since we're still working out the details of how the two tools would work together.
So we do want to provide more opinionated feedback about your code — for instance, highlighting when implicit `Any`s show up in an otherwise fully type-annotated function. But we view that as being a linter rule, which will likely be handled by ruff.
> Gradual typing means that an implicit "any" (unknown type) anywhere in your code base is not an error or even a warning.
That depends on the implementation of gradual typing. Elixir implements gradual set-theoretic types where dynamic types are a range of existing types and can be refined for typing violations. Here is a trivial example:
def example(x) do
{Integer.to_string(x), Atom.to_string(x)}
end
Since the function is untyped, `x` gets an initial type of `dynamic()`, but it still reports a typing violation, because `x` first gets refined as `dynamic(integer())`, which is then incompatible with the `atom()` type.
We also introduced the concept of strong arrows, which allows the dynamic and static parts of a codebase to interact without introducing runtime checks while remaining sound. More information here: https://elixir-lang.org/blog/2023/09/20/strong-arrows-gradua...
As mentioned in other comments, TypeScript, which follows this kind of gradual typing, has a number of flags to disable it (gradually, so to speak). No reason ty couldn't do the same.
Responding to your gradual-typing-anti-pattern bit: agreed that dynamic-language behaviors can be extreme, but it's also easy to get into crazy type land. Putting aside a discussion of type systems, teams can always add runtime checks like pydantic to ensure your types match reality.
Sorbet (the Ruby typechecker) does this: it introduces runtime checks on signatures.

Similarly, in TS we have zod.
In code where you really want to have these guarantees, you turn on errors like "no implicit any" in mypy and tighten the restrictions on the files you care about.
You still have the "garbage in/garbage out" problem on the boundaries but at the very least you can improve confidence. And if you're hardcore... turn that on all over, turn off explicit Any, write wrappers around all of your untyped dependencies etc etc. You can get what you want, just might be a lot of work
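For instance, mypy lets you ratchet this up per module in its config (the flag names are real mypy options; the module name is a placeholder):

    # mypy.ini
    [mypy]
    # lenient defaults for the legacy bulk of the codebase

    [mypy-billing.*]
    disallow_untyped_defs = True    # every function must be fully annotated
    disallow_any_explicit = True    # bans explicit "Any" too
    disallow_any_generics = True    # bans bare "list"/"dict"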
Yeah, I’m torn because, in my experience, gradual typing means the team members who want it implement it in their code and the others do not or are very lax in their typing. Some way of swapping between gradual and strict would be nice.

15 seconds after doing "man mypy": --disallow-any-expr

Less than it took you to write all that.
Unless you're doing greenfield, gradual typing is really the only way. I've incorporated type hinting in several legacy Python code bases with mypy, and really the only sensible way is to "opt in" one module at a time. If pyrefly doesn't support that, I think its use will be pretty limited. Unless maybe they are going for the LLM codegen angle - I could see a very fast and strict type checker being useful for LLM-generated Python scripts.
It reminds me of the early days of Typescript rollout, which similarly focused on a smooth on-boarding path for existing large projects.
More restrictive requirements (ie `noImplicitAny`) could be turned on one at a time before eventually flipping the `strict` switch to opt in to all the checks.
Although I'm paid to write (among other things) Python and not Rust, I would think of myself as a Rust programmer and to me the gradual guarantee also makes most sense.
This is a big turnoff for me. Half the point of adding type annotations to Python is to tame its error-prone dynamic typing. I want to know when I've done something stupid, even if it is technically allowed by Python itself.
Hopefully they'll add some kind of no-implicit-any or "strict" mode for people who care about having working code...
> my_list = [1, 2, 3]

> pyrefly, mypy, and pyright all assume that my_list.append("foo") is a typing error, even though it is technically allowed (Python collections can have multiple types of objects!)
> If this is the intended behavior, ty is the only checker that implicitly allows this without requiring additional explicit typing on my_list.
EDIT: I didn't intend my comment to be this sharp, I am actually rooting for ty to succeed :)
ORIGINAL: I am strongly against ty behaviour here. In production code you almost always have single type lists and it is critical that the typechecker assumes this, especially if the list already has same-type _literal_ items.
The fact that Python allows this has no bearing at all. To me having list[int | str] implicitly allowed by the typechecker seems like optimizing for beginner-level code.
> I am strongly against ty behaviour here.

[ty developer here]

Please note that ty is not complete!

In this particular example, we are tripped up because ty does not do anything clever to infer the type of a list literal. We just infer `list[Unknown]` as a placeholder, regardless of what elements are present. `Unknown` is a gradual type (just like `Any`), and so the `append` call succeeds because every type is assignable to `Unknown`.
We do have plans for inferring a more precise type of the list. It will be more complex than you might anticipate, since it will require "bidirectional" typing to take into account what you're doing with the list in the surrounding context. We have a tracking issue for that here: https://github.com/astral-sh/ty/issues/168
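To make the current divergence concrete (behaviour as described in the article and this thread):

    my_list = [1, 2, 3]     # pyright/mypy infer list[int]; ty currently infers list[Unknown]
    my_list.append("foo")   # error under list[int], silently fine under list[Unknown]

    explicit: list[int | str] = [1, 2, 3]
    explicit.append("foo")  # fine everywhere once the intent is written down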
I hope I didn't come off as angry or anything, I was just very surprised by the behaviour :)
I am talking from some experience as I had to convert circa 40k lines of untyped code (dicts passed around etc) to fully typed. IIRC this behaviour would have masked a lot of bugs in my situation. (I relied on mypy at first, but migrated to pyright about 1/4 in).
But otherwise it's good to hear that this is still in progress and I wish the project the best of luck.
So, how does that relate to this quote from the article?
> ty, on the other hand, follows a different mantra: the gradual guarantee. The principal idea is that in a well-typed program, removing a type annotation should not cause a type error. In other words: you shouldn’t need to add new types to working code to resolve type errors.

It seems like `ty`'s current behaviour is compatible with this, but the changed behaviour won't be (unless it will just be impossible to type a list of different types).
I don't think it's optimizing for beginner-level code, I think it's optimizing for legacy code. Introducing a type checker to a large existing untyped codebase is a big lift, but becomes less of one if almost all existing code is accepted.
Well then, support an option to enable that kind of behaviour? Make it an explicit decision by the devs. I think running into a type error and then adding an exception to your config is safer than silently passing and only learning about the mixed types in a production bug.
list[int | str] might usually be a mistake, but what about this?

my_list = [BarWidget(...), FooWidget(...)]
my_list.append(BazWidget(...))
my_list.append(7)
Wouldn't it be nice if the type checker could infer the type hint there, which is almost certainly intended to be list[Widget], and allow the first append and flag the second one?
The problem with the pyrefly behavior is that if you have a large codebase that isn't using any sort of Python typechecking, you can't just adopt this tool incrementally. You have to go fix up all of these issues. So you need to get widespread support for this migration.
For an internal tool at Meta, this is fine. Just make all your engineers adopt the style guide.
For introducing a tool gradually at an organization where this sort of change isn't one of the top priorities of engineering leadership, being more accepting is great. So I prefer the way ty does this, even though in my own personal code I would like my tool to warn me if I mix types like this.
>The fact that Python allows this has no bearing at all. To me having list[int | str] implicitly allowed by the typechecker seems like optimizing for beginner-level code.
Yes, let's base our tooling on your opinion rather than what is allowed in Python.
I am strongly for ty's behaviour here. Working Python code should not raise type errors unless the user explicitly opts in to a more static subset of the language by adding type annotations.
Because to me this seems like a fantastic example of a highly probable mistake that a typechecker _should_ catch. Without defined types in this situation, a couple of things could happen: 1) it gets printed or passed to some other Any-typed method, the typechecker never yells at you, and it crashes in production; or 2) the typechecker catches the error somewhere long down the line and you have to backtrack to find where you might be appending a str to a list[int].
Instead it could mark it as an error (as all the other checkers do), and if that's what the user really intended they can declare the type as list[str | int] and everything down the line is checked correctly.
So in short, this seems like a great place to start pushing the user towards actually (gradually) typing their code, not just pushing likely bugs under the rug.
It depends on what happens with the list after that. Are there int specific operations applied or it is just printed? What if it is fed into objects with a str attribute where the ints could be cast to str?
I don't know. I would argue that since type checking in python is optional, the type checkers shouldn't care unless the programmer cares. A more interesting case would be my_list.append(2.45) or my_list.append(Decimal("2.0")). Those cases would be "numbers" not just "ints".
In the real world, a row of CSV data is not type checked -- and the world hasn't pushed the spreadsheet industry to adopt typed CSV data.
Astral's tooling is great and brings new energy into Python land, but what is the long game of all the Astral projects?
Integrate them into python natively?
Be gone in 5 years and leave unmaintained tooling behind?
Rug pull all of us with a subscription?
They'll most likely pursue some sort of business source licensing, where you will not be allowed to deploy apps in production using their tooling without the business paying some kind of subscription. I understand that none of their existing products fit this use case, but it will probably be a similar approach. VCs are not charities.
> The standard VC business model is to invest in stuff that FAANG will buy from them one day. The standard approach is to invest in stuff that's enough of a threat to FAANG that they'll buy it to kill it, but this seems more like they're gambling on an acqui-hire in the future.
I don’t think any of these questions are specific to Astral and can be applied to pretty much any project. ‘Be gone in 5 years and leave unmaintained tooling’ seems particularly plausible with regard to Facebook’s tooling.
I really wish they'd add first-class Django support. Sadly, its ORM architecture is impossible to type and impossible to change now. Django is one of the most important use cases for Python, and having fast, full type checking with Django is a dream, but it does require some special casing from the type checker.
I'm with you. I regularly hit errors with its ORM that make me think: "I thought I cast this class of errors aside years ago". I go over my query code very carefully, since the MK-1 eyeball is important here for spotting typos etc.
(I'm not commenting on it being possible or not to fix; but the current status)
It uses a huge amount of what I’m terming “getattr bullshit”: many/most fields of ORM objects are only determined at runtime (they’re technically determinABLE at early runtime during Django initialization, but in practice are often not actually visible via reflection until they are first used due to lazy caching).
What fields are present and what types they have is extremely non uniform: it depends heavily on ORM objects’ internal configuration and the way a given model class relates to other models, including circular dependencies.
(And when I say “fields” here, I’m not only referring to data fields; even simple models include many, many computed method-like fields, complex lazily-evaluatable and parametrizable query objects, fields whose types and behavior change temporally or in response to far-distant settings, and more).
Some of this complexity is inherent to what ORMs are as a class of tool—many ORMs in all sorts of languages provide developer affordances in the form of highly dynamic, metaprogramming-based DSL-ish APIs—but Django really leans in to that pattern more than most.
Add to that a very strong community tendency to lazily (as in diligence, not caching) subclass ORM objects in ways that shadow computed fields—and often sloppily override the logic used to compute what fields are available and how they act—and you have a very thorny problem space for type checkers.
I also want to emphasize that this isn’t some rare Django power-user functionality that is seldom used, nor is it considered deprecated or questionable—these computed fields are the core API of the Django ORM, so not only are they a moving target that changes with Django (and Django extension module) releases, but they’re also such a common kind of code that even minor errors in attempts to type-check them will be extremely visible and frustrating to a wide range of users.
None of that should be taken as an indictment of the Django ORM’s design (for the most part I find it quite good, and most of my major issues with it have little to do with type checking). Just trying to answer the question as directly as possible.
It's possible to write a Django typecheck shim using descriptors. There's some annoying stuff on the edges though, and for example if you are changing up fields in `__init__` then those aren't going to show up in your types.
Ultimately you can get typing for the usual cases, but it won't be complete because you can outright change the shape of your models in Django at runtime (actions that aren't type safe of course)
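A minimal sketch of the descriptor approach (hand-rolled here for illustration; projects like django-stubs do a far more thorough job):

    from typing import Generic, TypeVar, overload

    T = TypeVar("T")

    class Field(Generic[T]):
        def __set_name__(self, owner: type, name: str) -> None:
            self.name = name

        @overload
        def __get__(self, obj: None, objtype: type | None = None) -> "Field[T]": ...
        @overload
        def __get__(self, obj: object, objtype: type | None = None) -> T: ...

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self  # class access: Model.name is the Field itself
            return obj.__dict__[self.name]  # instance access: the stored value

        def __set__(self, obj: object, value: T) -> None:
            obj.__dict__[self.name] = value

    class User:
        name = Field[str]()

The overloads teach a checker that `User.name` is a `Field[str]` while `user.name` is a `str` - and none of this survives code that reshapes models at runtime, which is exactly the complaint above.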
I’ve built a few typecheckers (in different languages) that hew closer to Pyrefly’s behaviour than Ty’s behaviour.
If you have a large codebase that you want to be typesafe, Pyrefly’s approach means writing far fewer type annotations overall, even if the initial lift is much steeper.
As they're described here, Pyrefly's design choices make more sense to me. I like the way Typescript does type inference and it seems like Pyrefly is closer to that. Module-level incrementalism also seems like a good tradeoff. Fine-grained incrementalism on a function level seems like overkill. Performance should be good enough that it's not required.
I hope some typechecker starts doing serious supported notebook integration. And integration for live coding, not just a batch script to statically check your notebook. Finding errors with typing before running a 1-60 minute cell is a huge win.
Do you use Jupyter notebooks in VSCode? It uses the same pylance as regular python files, which actually gets annoying when I want to write throwaway code.
Anyone reading this: if you're like me and prefer the open-source version of VSCode, where Microsoft disables Pylance, I'd encourage you to try BasedPyright instead.
I echo the other response here. You absolutely should switch to using notebooks in VSCode with their static type checker. Language servers do exactly what you are wanting, with both notebook integration and «live coding».
For decades, big tech contributed relatively little in the way of Python ecosystem tooling. There's Facebook's Pyre, but that's about it. Nothing for package/dependency management, linting, formatting - so folks like those at Astral have stepped up to fill the gap.
Why is type checking the exception? With Google and Facebook and Astral all writing their own mypy replacements, I'm curious why this space is suddenly so busy.
Coming from a Meta background (not speaking on behalf of Meta):
"package/dependency management" - Everything is checked into a monorepo, and built with [Buck2](https://buck2.build/). There's tooling to import/update packages, but no need to reinvent pip or other package managers. Btw, Buck2 is pretty awesome and supports a ton of languages beyond python, but hasn't gotten a ton of traction outside of Meta.
"linting, formatting" - [Black](https://github.com/psf/black) and other public ecosystem tooling is great, no need to develop internally.
"why is type checking the exception" - Don't know about Astral, but for Meta / Google, most everyone else doesn't design for the scale of their monorepos. Meta moved from SVN to Git to Mercurial, then forked Mercurial into [Sapling](https://sapling-scm.com/) because simple operations were too slow for the number of files in their repo, and how frequently they receive diffs.
There are obvious safety benefits to type checking, but with how much Python code Meta has, mypy is not an option - it would take far too much time / memory to provide any value.
Probably because a large amount of AIs are churning out Python code, and they need type-checkers to sanitize/validate that output quickly. Dynamic languages are hard enough for people to make sense of half the time, and I bet AI agents are struggling even more.
What was not immediately obvious to me (but should have been) is that these are dev-time type checkers -- I think. (I think that both from the GitHub descriptions, which focus heavily on editing, and from the article.) That's really useful, because type inference is lacking, to me, in-editor. I tend to ask Copilot: 'add type annotations'.
So, to complement this, can I share my favorite _run-time_ type checker? Beartype: this reads your type annotations (i.e. I see this is where Pyrefly and ty come in) and enforces the types at runtime. It is blazingly fast - as in, incredibly fast. I use it for all my deployed code.
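If you haven't tried it, usage is a one-line decorator (this is beartype's real public API; the function is a toy):

    from beartype import beartype

    @beartype
    def slow_add(a: int, b: int) -> int:
        return a + b

    slow_add(1, 2)       # fine
    slow_add("a", "b")   # raises a beartype violation at call time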
beartype is great, but I only find it useful at the edges. Runtime checks aren't needed if you have strict typing throughout the project. On a gradually-typed codebase, you can use beartype (e.g. is_bearable) to ensure the data you're ingesting has the proper type. I usually use it when I'm dealing with JSON types.
Why isn’t it necessary? Do you mean that with edit-time type checking, you can catch all errors, so no need for runtime verification the edit-time type decls match?
Any progress in the Python ecosystem on static checking for things like tensors and data frames? As I understand it, the comments at the bottom of this FAQ still apply:
I have a problem with Python's `Optional` type. For example, for the following code:
from typing import Optional, Union

def square(
    a: Union[int, float],
    b: Optional[int] = 2
) -> float:
    c = a**b
    return c
Many type checkers throw an error here, because `Optional[int]` actually means `int | None`, and you cannot raise an `int` or a `float` to the power of `None`. Are there any plans for *ty* around this?
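(For what it's worth, the checkers are satisfied once the `None` case is narrowed away before use - assuming `None` should just mean "use the default exponent":)

    from typing import Optional, Union

    def square(
        a: Union[int, float],
        b: Optional[int] = 2
    ) -> float:
        exp = 2 if b is None else b  # narrow Optional[int] down to int
        return float(a ** exp)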
I'd like to experiment with writing a reflection based dynamic type annotator for one of these. Just as a toy. Imagine a module which monkey patches every function, class in the parent module it is loaded into, then reflects on the arguments at run time slowly building up a database of expected type signatures. The user would then pick through the deltas approving what made sense.
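A minimal sketch of that idea (toy-grade, as the parent says; all names are made up):

    import functools
    import inspect
    from collections import defaultdict

    # (function qualname, parameter name) -> set of observed type names
    observed: dict[tuple[str, str], set[str]] = defaultdict(set)

    def record_types(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            for name, value in bound.arguments.items():
                observed[(func.__qualname__, name)].add(type(value).__name__)
            result = func(*args, **kwargs)
            observed[(func.__qualname__, "return")].add(type(result).__name__)
            return result

        return wrapper

    def instrument(module):
        # monkey patch every plain function defined in the module
        for name, obj in vars(module).items():
            if inspect.isfunction(obj) and obj.__module__ == module.__name__:
                setattr(module, name, record_types(obj))

Run the test suite with a few modules instrumented, then diff `observed` against any existing annotations and approve the deltas by hand.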
This is a neat idea, but for large systems it could take a long time to fully exercise every code path with all possible input data. (If this sounds way too excessive, that's because normal people don't do this.)
Unless you mean something like record prod for a few weeks to months, similar to how Netflix uses eBPF (except they run it all the time).
I think my main use case would be to run it on unit tests, and then to provide a human-in-the-loop way to slowly hydrate small to medium sized code bases with types. Maybe progressive refactor tools aren't something anyone actually wants to use or maintain, because I just don't see them around that much.
Not yet, tbh. I would realistically expect at least a year or more before you can expect any kind of parity with e.g. pyright/basedpyright. Type checkers are hard and have a long tail of functionality that must be implemented.
ty is definitely not ready to be a pyright replacement yet. But it is usable as an LSP for simple things like go to definition, and deeper LSP features are on the roadmap for the eventual beta and GA releases.
As someone who has added type hints to two huge code bases that had none: it's not that painful. Something much more painful is realizing the code base you are adding type hints to is irreconcilably bugged by design, in a way that would not have been possible had type checking been used.
[ty developer here]
We are happy with the attention that ty is starting to receive, but it's important to call out that both ty and pyrefly are still incomplete! (OP mentions this, but it's worth emphasizing again here.)
There are definitely examples cropping up that hit features that are not yet implemented. So when you encounter something where you think what we're doing is daft, please recognize that we might have just not gotten around to that yet. Python is a big language!
Really loving those markdown style tests. I think it's a really fantastic idea that allows the tests to easily act as documentation too.
Can you explain how you came up with this solution? Rust docs code-examples inspired?
That concept has been formalized as part of the Python standard library.
https://docs.python.org/3/library/doctest.html
4 replies →
Elixir has this.
https://hexdocs.pm/elixir/main/docs-tests-and-with.html
I use this in my books to show the output but also to "test" that the code found in my books actually works.
I love doctest as it works so well with a REPL but unfortunately it hasn't really gained traction anywhere I've seen.
surfacing revealed types as `@TODO` made me laugh, but thinking about it it's actually a pretty neat touch!
It really helps in our mdtests, because then we can assert that not-implemented things are currently wrong but for the right reasons!
Totally orthogonal question, but since you're deep in that side of Rust dev -
The subject of a "scripting language for Rust" has come up a few times [1]. A language that fits nicely with the syntax of Rust, can compile right alongside rust, can natively import Rust types, but can compile/run/hot reload quickly.
Do you know of anyone in your network working on that?
And modulus the syntax piece, do you think Python could ever fill that gap?
[1] https://news.ycombinator.com/item?id=44050222
I don't know that I'd want the scripting language to be compiled, for reasons that are outside the scope of this reply. So removing that constraint, the coolest thing I've seen in this space recently is kyren's Piccolo:
https://kyju.org/blog/piccolo-a-stackless-lua-interpreter/
> And modulus the syntax piece, do you think Python could ever fill that gap?
I would never ever want a full fledged programming language to build type checking plugins, and doubly so in cases where one expects the tool to run in a read-write context
I am not saying that Skylark is the solution, but it's sandboxed mental model aligns with what I'd want for such a solution
I get the impression the wasm-adjacent libraries could also help this due to the WASI boundary already limiting what mutations it is allowed
There's Gluon, which doesn't share Rust's syntax but does have a Hindley-Milner-based type system and embeds pretty seamlessly in a Rust program.
https://github.com/gluon-lang/gluon
1 reply →
Most of the time, you want the type to be dynamic in a scripting langage, as you don't want to expose the types to the user. With this in mind, rhai and rune are pretty good. On the python front, there was also the pyoxidizer thing, put it seems dead.
2 replies →
I am very interested in both of these. Coming from the TypeScript world I'm really interested in the different directions (type inference or not, intersections and type narrowing...). As a Python developer I'm wearily resigned to there being 4+ python type checkers out there, all of which behave differently. How very python...
Following these projects with great interest though. At the end of the day, a good type checker should let us write code faster and more reliably, which I feel isn't yet the case with the current state of the art of type checking for python.
Good luck with the project!
I am not well versed in python programming, this is just my opinion as an outsider.
For anyone interested in using these tools, I suggest reading the following:
https://www.reddit.com/r/Python/comments/10zdidm/why_type_hi...
That post should probably be taken lightly, but I think that the goal there is to understand that even with the best typing tools, you will have troubles, unless you start by establishing good practices.
For example, Django is large code base, and if you look at it, you will observe that the code is consistent in which features of python are used and how; this project passes the stricter type checking test without troubles. Likewise, Meta certainly has a very large code base (why develop a type checker otherwise?), and they must have figured out that they cannot let their programmers write code however they like; I guess their type checker is the stricter one for that reason.
Python, AFAIK, has many features, a very permissive runtime, and perhaps (not unlike C++) only some limited subset should be used at any time to ensure that the code is manageable. Unfortunately, that subset is probably different depending on who you ask, and what you aim to do.
(Interestingly, the Reddit post somehow reminded me of the hurdles Rust people have getting the Linux kernel guys to accept their practice: C has a much simpler and carefree type system, but Rust being much more strict rubs those C guys the wrong way).
The top comment in that post shuts down the whole nonsense pretty quickly and firmly:
> If you have a super-generic function like that and type hinting enforced, you just use Any and don't care about it.
It's a stupid example, but even within the context of a `slow_add` function in a library: maybe the author originally never even thought people would pass in non-numeric values, so in the next version update instead of a hardcoded `time.sleep(0.1)` they decide to `time.sleep(a / b)`. Oops, now it crashes for users who passed in strings or tuples! If only there were a way to declare that the function is only intended to work with numeric values, instead of forcing yourself to provide backwards compatibility for users who used that function in unexpected ways that happened to work.
IMO: for Python meant to run non-interactively with any sort of uptime guarantees, type checking is a no-brainer. You're actively making a mistake if you choose to not add type checking.
As the author of that post, I'd like to point out the example was meant to be stupid.
The purpose was to show different ideologies and expectations on the same code don't work, such as strict backwards compatibilities, duck typing, and strictly following linting or type hinting rules (due to some arbitrary enforcement). Although re-reading it now I wish I'd spent more than an evening working on it, it's full of issues and not very polished.
> If you have a super-generic function like that and type hinting enforced, you just use Any and don't care about it.
Following the general stupidness of the post: they are now unable to do that because a security consultant said they have to enable and can not break RUFF rule ANN401: https://docs.astral.sh/ruff/rules/any-type/
13 replies →
One thing that post does do though is very clearly highlight the difference between Python's type system and say ... TypeScript's.
TypeScript's goal is to take a language with an unhinged duck type system that allows people to do terrible things and then allow you to codify and lock in all of those behaviours exactly as they're used.
Mypy (and since it was written by GVM and codified in the stdlib by extension Python and all other typecheckers)'s goal is to take a language with an unhinged duck type system that allows people to do terrible things and then pretend that isn't the case and enforce strict academic rules and behaviours that don't particularly care about how real people write code and interact with libraries.
If you include type hints from the very beginning than you are forced to use the very limited subset of behaviours that mypy allow you to codify and everything will be "fine".
If you try to add type hints to a mature project, you will scream with frustration as you discover how many parts of the codebase literally cannot be represented in the extremely limited type system.
At this point I'm fairly convinced that the effort one would spend trying to typecheck a python program is better spent migrating away from python into a language that has a proper type system, then using interop so you can still have the bits/people that need python be in python.
Obviously that isn't always possible but you can spend far too long trying to make python work.
I think you're forgetting how easy type annotation is.
I occasionally spend like 2h working on some old python code. I will spend say 15 minutes of that time adding type annotations (sometimes requires some trivial refactoring). This has an enormous ROI, the cost is so low and the benefit is so immediate.
In these cases migrating code to a proper language and figuring out interop is not on my radar, it would be insane. So having the option to get some best-effort type safety is absolutely fantastic.
I can definitely see your point, it's a useful analysis for projects under heavy development. But if you have a big Python codebase that basically just works and only sees incremental changes, adding type annotations is a great strategy.
If you're supposedly good at software and you spent too long trying to make python work consider the possibility that you're not good at software?
Python has flaws and big ones at that, but there's a reason it's popular. Especially with tools like pydantic and fastapi and uv (and streamlit) you can do insane things in hours what would take weeks and months before. Not to mention how good AI is at generating code in these frameworks. I especially like typing using pydantic, any method is now able to dump and load data from files and dbs and you get extremely terse validated code. Modern IDEs also make quick work of extracting value even from partially typed code. I'd suggest you just open your mind up to imperfect things and give them a shot.
Six month into learning to build a modern python app, with linters, type systems, tests, venvs, package managers, etc… I realized that the supposed difficulty of rust is drastically less than coming to speed and then keeping up with the python “at scale” ecosystem.
1 reply →
I don't understand this point at all. I've worked on Django codebases which have a huge set of typing problems... and while it's not 100% I get a lot of value out of type checking.
You annotate enough functions and you get a really good linter out of it!
Unfortunately with us being in the middle of the AI hype cycle, everyone and their dog is currently busy migrating to python.
6 replies →
If you do that you need to compile, which means you can't just distribute a text file with your python program. You need a build infrastructure for every python version, every architecture and every OS.
Have fun with that!
> Python, AFAIK, has many features, a very permissive runtime, and perhaps (not unlike C++) only some limited subset should be used at any time to ensure that the code is manageable. Unfortunately, that subset is probably different depending on who you ask, and what you aim to do.
I'll get started on the subset of Python that I personally do not wish to use in my own codebase: meta classes, descriptors, callable objects using __call__, object.__new__(cls), names trigger the name mangling rules, self.__dict__. In my opinion, all of the above features involve too much magic and hinder code comprehension.
There's a time and a place for each of them:
* Meta classes: You're writing Pydantic or an ORM.
* Descriptors: You're writing Pydantic or an ORM.
* Callable objects: I've used these for things like making validators you initialize with their parameters in one place, then pass them around so other functions can call them. I'd probably just use closures if at all possible now.
* object.__new__: You're writing Pydantic or an ORM.
* Name mangling: I'm fine with using _foo and __bar where appropriate. Those are nice. Don't ever, ever try to de-mangle them or I'll throw a stick at you.
* self.__dict__: You're writing Pydantic or an ORM, although if you use this as shorthand for "doing things that need introspection", that's a useful skill and not deep wizardry.
Basically, you won't need those things 99.99% of the time. If you think you do, you probably don't. If you're absolutely certain you do, you might. It's still good and important to understand what they are, though. Even if you never write them yourself, at some point you're going to want to figure out why some dependency isn't working the way you expected, and you'll need to read and know what it's doing.
5 replies →
You should try Go!
1 reply →
Can you share a little bit about what makes you form opinions when you are not even using the language? I think its fascinating how especially discussions about typing makes people shake their fists against a language they don't even use - and like your post make up some contrived example.
>I think that the goal there is to understand that even with the best typing tools, you will have troubles, unless you start by establishing good practices.
Like - what makes you think that python developers doesn't understand stuff about Python, when they are actively using the language as opposed to you?
Indeed, I'm not a regular Python practitioner. I had to use it from time to time because it's the language chosen by the tools I happened to use at that time, like Blender, or Django. In the former case, it wasn't very enjoyable (which says a lot about my skills in that area, or rather lack thereof), while in the latter case I found it quite likeable. So that's my background as far as python goes.
I must admit that I largely prefer static typing, which is why I got interested in that article. It's true that trying to shoehorn this feature in the Python ecosystem is an uphill battle: there's a lot of good engineering skill spent on this.
Perhaps there's a connection to make between this situation and an old theorem about incompleteness?
https://copilot.microsoft.com/shares/2LpT2HFBa3m6jYxUhk9fW
(was generated in quick mode, so you might want to double check).
As someone who has been writing python for years the worst mistake I have ever seen people make is not add type hints and not using a type checker.
Also not creating custom, expressive Pydantic types and using nested dicts in places. Nested dicts suck, you never know what you're getting, and it's well worth the time converting them to classes.
1 reply →
To try to tl;dr that rather long post:
> When you add type hints to your library's arguments, you're going to be bitten by Hyrum's Law and you are not prepared to accurately type your full universe of users
That's understandable. But they're making breaking changes, and those are just breaking change pains - it's almost exactly the same if they had instead done this:
but anyone looking at that would say "well yeah, that's a breaking change, of course people are going to complain".
The only real difference here is that it's a developer-breaking change, not a runtime-breaking one, because Python does not enforce type hints at runtime. Existing code will run, but existing tools looking at the code will fail. That offers an easier workaround (just ignore it), but is otherwise just as interruptive to developers because the same code needs to change in the same ways.
---
In contrast! Libraries can very frequently add types to their return values and it's immediately useful to their users. You're restricting your output to only the values that you already output - essentially by definition, only incorrect code will fail when you do this.
> ty, on the other hand, follows a different mantra: the gradual guarantee. The principal idea is that in a well-typed program, removing a type annotation should not cause a type error. In other words: you shouldn’t need to add new types to working code to resolve type errors.
The gradual guarantee that Ty offers is intriguing. I’m considering giving it a try based on that.
With a language like Python with existing dynamic codebases, it seems like the right way to do gradual typing.
Gradual typing means that an implicit "any" (unknown type) anywhere in your code base is not an error or even a warning. Even in critical code you thought was fully typed. Where you mistakenly introduce a type bug and due to some syntax or inference limits the type checker unexpectedly loses the plot and tells you confidently "no problems in this file!"
I get where they're coming from, but the endgame was a huge issue when I tried mypy - there was no way to actually guarantee that you were getting any protection from types. A way to assert "no graduality to this file, it's fully typed!" is critical, but gradual typing is not just about migrating but also about the crazy things you can do in dynamic languages and being terrified of false positives scaring away the people who didn't value static typing in the first place. Maybe calling it "soft" typing would be clearer.
I think gradual typing is an anti-pattern at this point.
> Gradual typing means that an implicit "any" (unknown type) anywhere in your code base is not an error or even a warning. Even in critical code you thought was fully typed. Where you mistakenly introduce a type bug and due to some syntax or inference limits the type checker unexpectedly loses the plot and tells you confidently "no problems in this file!"
This is a good point, and one that we are taking into account when developing ty.
The benefit of the gradual guarantee is that it makes the onboarding process less fraught when you want to start (gradually) adding types to an untyped codebase. No one wants a wall of false positive errors when you first start invoking your type checker.
The downside is exactly what you point out. For this, we want to leverage that ty is part of a suite of tools that we're developing. One goal in developing ty is to create the infrastructure that would let ruff support multi-file and type-aware linter rules. That's a bit hand-wavy atm, since we're still working out the details of how the two tools would work together.
So we do want to provide more opinionated feedback about your code — for instance, highlighting when implicit `Any`s show up in an otherwise fully type-annotated function. But we view that as being a linter rule, which will likely be handled by ruff.
4 replies →
> Gradual typing means that an implicit "any" (unknown type) anywhere in your code base is not an error or even a warning.
That depends on the implementation of gradual typing. Elixir implements gradual set-theoretic types where dynamic types are a range of existing types and can be refined for typing violations. Here is a trivial example:
Since the function is untyped, `x` gets an initial value of `dynamic()`, but it still reports a typing violation because it first gets refined as `dynamic(integer())` which is then incompatible with the `atom()` type.
We also introduced the concept of strong arrows, which allows dynamic and static parts of a codebase to interact without introducing runtime checks and remaining sound. More information here: https://elixir-lang.org/blog/2023/09/20/strong-arrows-gradua...
4 replies →
As mentioned in other comments - in TypeScript which follows this gradual typing there is a number of flags to disable it (gradually so to speak). No reason ty wouldn't do it.
Responding to your gradual typing anti-pattern bit: Agree that dynamic language behaviors can be extreme but it’s also easy to get into crazy type land. Putting aside a discussion of type systems, teams can always add runtime checks like pydantic to ensure your types match reality.
Sorbet (Ruby typechecker) does this where it introduces a runtime checks on signatures.
Similarly in ts, we have zod.
5 replies →
In code where you really want to have these guarantees you turn on errors lke "no implicit any" in mypy and tighten the restrictions on the files you care about.
You still have the "garbage in/garbage out" problem on the boundaries but at the very least you can improve confidence. And if you're hardcore... turn that on all over, turn off explicit Any, write wrappers around all of your untyped dependencies etc etc. You can get what you want, just might be a lot of work
Yeah, I’m torn because, in my experience, gradual typing means the team members who want it implement it in their code and the others do not or are very lax in their typing. Some way of swapping between gradual and strict would be nice.
Fifteen seconds after running "man mypy": --disallow-any-expr.
Less time than it took you to write all that.
Unless you're doing greenfield work, gradual typing is really the only way. I've incorporated type hinting into several legacy Python code bases with mypy, and really the only sensible way is to opt in one module at a time. If pyrefly doesn't support that, I think its use will be pretty limited. Unless maybe they are going for the LLM code-gen angle; I could see a very fast and strict type checker being useful for LLMs generating Python scripts.
It reminds me of the early days of Typescript rollout, which similarly focused on a smooth on-boarding path for existing large projects.
More restrictive requirements (e.g. `noImplicitAny`) could be turned on one at a time before eventually flipping the `strict` switch to opt in to all the checks.
Although I'm paid to write (among other things) Python rather than Rust, I think of myself as a Rust programmer, and to me the gradual guarantee also makes the most sense.
This is a big turnoff for me. Half the point of adding type annotations to Python is to tame its error-prone dynamic typing. I want to know when I've done something stupid, even if it is technically allowed by Python itself.
Hopefully they'll add some kind of no-implicit-any or "strict" mode for people who care about having working code...
> my_list = [1, 2, 3]
> pyrefly, mypy, and pyright all assume that my_list.append("foo") is a typing error, even though it is technically allowed (Python collections can have multiple types of objects!)
> If this is the intended behavior, ty is the only checker that implicitly allows this without requiring additional explicit typing on my_list.
EDIT: I didn't intend my comment to be this sharp, I am actually rooting for ty to succeed :)
ORIGINAL: I am strongly against ty's behaviour here. In production code you almost always have single-type lists, and it is critical that the typechecker assumes this, especially if the list already has same-type _literal_ items.
The fact that Python allows this has no bearing at all. To me having list[int | str] implicitly allowed by the typechecker seems like optimizing for beginner-level code.
> I am strongly against ty's behaviour here.
[ty developer here]
Please note that ty is not complete!
In this particular example, we are tripped up because ty does not do anything clever to infer the type of a list literal. We just infer `list[Unknown]` as a placeholder, regardless of what elements are present. `Unknown` is a gradual type (just like `Any`), and so the `append` call succeeds because every type is assignable to `Unknown`.
We do have plans for inferring a more precise type of the list. It will be more complex than you might anticipate, since it will require "bidirectional" typing to take into account what you're doing with the list in the surrounding context. We have a tracking issue for that here: https://github.com/astral-sh/ty/issues/168
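Roughly, bidirectional inference means the expected type flows from the context into the literal. One flavor of what that could mean (a sketch of possible future behaviour, not what ty does today):

    nums = [1, 2, 3]              # today: inferred as list[Unknown], anything goes
    typed: list[int] = [1, 2, 3]  # with bidirectional inference, the declared
    typed.append("foo")           # type pins the literal, making this an error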
I hope I didn't come off as angry or anything, I was just very surprised by the behaviour :)
I am talking from some experience as I had to convert circa 40k lines of untyped code (dicts passed around etc) to fully typed. IIRC this behaviour would have masked a lot of bugs in my situation. (I relied on mypy at first, but migrated to pyright about 1/4 in).
But otherwise it's good to hear that this is still in progress and I wish the project the best of luck.
1 reply →
So, how does that relate to this quote from the article?
It seems like `ty`'s current behaviour is compatible with this, but the planned change won't be (unless it will just be impossible to type a list of different types).
2 replies →
Have you all looked at how Pyrefly does it, or are your methods incompatible?
1 reply →
I don't think it's optimizing for beginner-level code, I think it's optimizing for legacy code. Introducing a type checker to a large existing untyped codebase is a big lift, but becomes less of one if almost all existing code is accepted.
Well then, support an option to enable that kind of behaviour? Make it an explicit decision by the devs. I think running into a type error and then adding an exception to your config is safer than silently passing and only learning about the mixed types via a production bug.
2 replies →
list[int | str] might usually be a mistake, but what about:

    my_list = [BarWidget(...), FooWidget(...)]
    my_list.append(BazWidget(...))
    my_list.append(7)

Wouldn't it be nice if the type checker could infer the type hint there, which is almost certainly intended to be list[Widget], and allow the first append while flagging the second?
The problem with the pyrefly behavior is that if you have a large codebase that isn't using any sort of Python typechecking, you can't just adopt this tool incrementally. You have to go fix up all of these issues. So you need to get widespread support for this migration.
For an internal tool at Meta, this is fine. Just make all your engineers adopt the style guide.
For introducing a tool gradually at an organization where this sort of change isn't one of the top priorities of engineering leadership, being more accepting is great. So I prefer the way ty does this, even though in my own personal code I would like my tool to warn me if I mix types like this.
>The fact that Python allows this has no bearing at all. To me having list[int | str] implicitly allowed by the typechecker seems like optimizing for beginner-level code.
Yes, let's base our tooling on your opinion rather than what is allowed in Python.
I am strongly for ty's behaviour here. Working Python code should not raise type errors unless the user explicitly opts in to a more static subset of the language by adding type annotations.
> and it is critical that the typechecker assumes this
Why is it critical, though? If having a `list[int]` was a requirement, I would expect a type error where that requirement is made explicit.
Because to me this seems like a fantastic example of a very likely mistake that a typechecker _should_ catch. Without defined types in this situation, a couple of things could happen: 1) the list gets printed or passed to some other `Any`-typed method, the typechecker never yells at you, and it crashes in production; or 2) the typechecker catches the error somewhere far down the line and you have to backtrack to find where you might be appending a str to a list[int].
Instead it could mark it as an error (as all the other checkers do), and if that's what the user really intended they can declare the type as list[str | int] and everything down the line is checked correctly.
So in short, this seems like a great place to start pushing the user towards actually (gradually) typing their code, not just pushing likely bugs under the rug.
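Concretely, the opt-in would be a one-line annotation:

    my_list: list[int | str] = [1, 2, 3]  # explicit: a mixed list is intended
    my_list.append("foo")                 # fine, and still checked downstream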
It depends on what happens with the list after that. Are there int-specific operations applied, or is it just printed? What if it is fed into objects with a str attribute, where the ints could be cast to str?
I don't know. I would argue that since type checking in Python is optional, the type checkers shouldn't care unless the programmer cares. A more interesting case would be my_list.append(2.45) or my_list.append(Decimal("2.0")): those cases would be "numbers", not just "ints".
In the real world, a row of CSV data is not type checked -- and the world hasn't pushed the spreadsheet industry to adopt typed CSV data.
Astral tooling is great and brings new energy into Python land, but what is the long game of all the Astral projects? Integrate them into Python natively? Be gone in 5 years and leave unmaintained tooling behind? Rug-pull all of us with a subscription?
They'll most likely pursue some sort of business source licensing, where you will not be allowed to deploy apps in production using their tooling without the business paying some kind of subscription. I understand that none of their existing products fit this use case, but it will probably be a similar approach. VCs are not charities.
As a Redditor said:
> The standard VC business model is to invest in stuff that FAANG will buy from them one day. The standard approach is to invest in stuff that's enough of a threat to FAANG that they'll buy it to kill it, but this seems more like they're gambling on an acqui-hire in the future.
I have never seen a FAANG company buy a pure programming-language based tooling startup.
1 reply →
Why would they do that when they can just fork for free?
1 reply →
I don’t think any of these questions are specific to Astral; they can be applied to pretty much any project. ‘Be gone in 5 years and leave unmaintained tooling’ seems particularly plausible with regard to Facebook’s tooling.
Use any of them at your own risk I suppose.
The announcement talked about selling services built on top of the tools: https://astral.sh/blog/announcing-astral-the-company-behind-...
I think I heard somewhere that they are working on other tools that only big enterprises need, like a hosted private package registry.
I really wish they'd ship first-class Django support. Sadly, its ORM architecture is impossible to type and impossible to change now. Django is one of the most important use cases for Python; having fast, full type checking with Django is a dream, but it does require some special-casing from the type checker.
I'm with you. I regularly hit errors with its ORM that make me think: "I thought I cast this class of errors aside years ago". I go over my query code very carefully, since the MK-1 eyeball is important here for spotting typos etc.
(I'm not commenting on it being possible or not to fix; but the current status)
What makes the Django ORM impossible to type check?
It uses a huge amount of what I’m terming “getattr bullshit”: many/most fields of ORM objects are only determined at runtime (they’re technically determinABLE at early runtime during Django initialization, but in practice are often not actually visible via reflection until they are first used due to lazy caching).
What fields are present and what types they have is extremely non-uniform: it depends heavily on ORM objects’ internal configuration and the way a given model class relates to other models, including circular dependencies.
(And when I say “fields” here, I’m not only referring to data fields; even simple models include many, many computed method-like fields, complex lazily-evaluatable and parametrizable query objects, fields whose types and behavior change temporally or in response to far-distant settings, and more).
Some of this complexity is inherent to what ORMs are as a class of tool—many ORMs in all sorts of languages provide developer affordances in the form of highly dynamic, metaprogramming-based DSL-ish APIs—but Django really leans in to that pattern more than most.
Add to that a very strong community tendency to lazily (as in diligence, not caching) subclass ORM objects in ways that shadow computed fields—and often sloppily override the logic used to compute what fields are available and how they act—and you have a very thorny problem space for type checkers.
I also want to emphasize that this isn’t some rare Django power-user functionality that is seldom used, nor is it considered deprecated or questionable—these computed fields are the core API of the Django ORM, so not only are they a moving target that changes with Django (and Django extension module) releases, but they’re also such a common kind of code that even minor errors in attempts to type-check them will be extremely visible and frustrating to a wide range of users.
None of that should be taken as an indictment of the Django ORM’s design (for the most part I find it quite good, and most of my major issues with it have little to do with type checking). Just trying to answer the question as directly as possible.
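To make the "determinable at early runtime" point concrete, here is a toy metaclass in the same spirit (entirely hypothetical names; Django's real machinery is far more involved):

    class Field:
        def __init__(self, py_type):
            self.py_type = py_type

    class ModelMeta(type):
        def __new__(mcs, name, bases, ns):
            cls = super().__new__(mcs, name, bases, ns)
            # attributes are rewritten at class-creation time, so a static
            # checker can't know what `User().name` is without running this
            for attr, value in list(ns.items()):
                if isinstance(value, Field):
                    setattr(cls, attr, property(
                        lambda self, a=attr: self.__dict__.get(a)))
            return cls

    class User(metaclass=ModelMeta):
        name = Field(str)  # the str-ness only exists after the metaclass runs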
It's possible to write a Django typecheck shim using descriptors. There's some annoying stuff around the edges, though; for example, if you are changing up fields in `__init__`, those aren't going to show up in your types.
Ultimately you can get typing for the usual cases, but it won't be complete, because you can outright change the shape of your models in Django at runtime (actions that aren't type-safe, of course).
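A sketch of the descriptor trick (hypothetical names; django-stubs does something similar in spirit, with far more machinery):

    from typing import Generic, TypeVar, overload

    T = TypeVar("T")

    class TypedField(Generic[T]):
        def __set_name__(self, owner: type, name: str) -> None:
            self.name = name

        @overload
        def __get__(self, obj: None, owner: type) -> "TypedField[T]": ...
        @overload
        def __get__(self, obj: object, owner: type) -> T: ...
        def __get__(self, obj, owner):
            return self if obj is None else obj.__dict__[self.name]

        def __set__(self, obj: object, value: T) -> None:
            obj.__dict__[self.name] = value

    class User:
        name: TypedField[str] = TypedField()

    u = User()
    u.name = "Ada"  # checkers see `u.name` as str via the overloads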
Isn't that largely on the shoulders of Django maintainers?
I’ve built a few typecheckers (in different languages) that hew closer to Pyrefly’s behaviour than Ty’s behaviour.
If you have a large codebase that you want to be typesafe, Pyrefly’s approach means writing far fewer type annotations overall, even if the initial lift is much steeper.
Ty effectively has noImplicitAny set to false.
As they're described here, Pyrefly's design choices make more sense to me. I like the way Typescript does type inference and it seems like Pyrefly is closer to that. Module-level incrementalism also seems like a good tradeoff. Fine-grained incrementalism on a function level seems like overkill. Performance should be good enough that it's not required.
I hope some typechecker starts doing serious, supported notebook integration. And integration for live coding, not just a batch script to statically check your notebook. Finding typing errors before running a 1-60 minute cell is a huge win.
Do you use Jupyter notebooks in VSCode? It uses the same pylance as regular python files, which actually gets annoying when I want to write throwaway code.
Anyone reading this, if you're like me and prefer the open source version of VSCode where Microsoft disables Pylance, I'd encourage you to try BasedPyright instead.
1 reply →
I echo the other response here. You absolutely should switch to using notebooks in VSCode with their static type checker. Language servers do exactly what you are wanting, with both notebook integration and «live coding».
I think I prefer Pyrefly's stronger type inference. It can be a pain on projects with a lot of dynamism, but I'll personally make the tradeoff.
Nice, I'm using basedpyright right now, both as a type checker in my IDE as well as in GitHub actions. It's good and mainly does what I want.
I'm not very fond of mypy as it struggles even with simple typing at times.
For decades, big tech contributed relatively little in the way of Python ecosystem tooling. There's Facebook's Pyre, but that's about it. Nothing for package/dependency management, linting, or formatting, so folks like those at Astral have stepped up to fill the gap.
Why is type checking the exception? With Google, Facebook, and Astral all writing their own mypy replacements, I'm curious why this space is suddenly so busy.
Coming from a Meta background (not speaking on behalf of Meta):
"package/dependency management" - Everything is checked into a monorepo, and built with [Buck2](https://buck2.build/). There's tooling to import/update packages, but no need to reinvent pip or other package managers. Btw, Buck2 is pretty awesome and supports a ton of languages beyond python, but hasn't gotten a ton of traction outside of Meta.
"linting, formatting" - [Black](https://github.com/psf/black) and other public ecosystem tooling is great, no need to develop internally.
"why is type checking the exception" - Don't know about Astral, but for Meta / Google, most everyone else doesn't design for the scale of their monorepos. Meta moved from SVN to Git to Mercurial, then forked Mercurial into [Sapling](https://sapling-scm.com/) because simple operations were too slow for the number of files in their repo, and how frequently they receive diffs.
There are obvious safety benefits to type checking, but with how much Python code Meta has, mypy is not an option - it would take far too much time / memory to provide any value.
Instagram built a linter with the ability to fix errors which is an improvement over flake8 & pylint: https://github.com/Instagram/Fixit
But Ruff is an even greater improvement over that
Probably because a large amount of AIs are churning out Python code, and they need type-checkers to sanitize/validate that output quickly. Dynamic languages are hard enough for people to make sense of half the time, and I bet AI agents are struggling even more.
What was not immediately obvious to me (but should have been) is that these are dev-time type checkers -- I think. (I gather this both from the GitHub descriptions, which focus heavily on editing, and from the article.) That's really useful, because in-editor type inference is lacking for me; I tend to ask Copilot to "add type annotations".
So as a complement to this, can I share my favorite _run-time_ type checker? Beartype: it reads your type annotations (which, I see, is where Pyrefly and ty come in) and enforces the types at runtime. It is blazingly fast, as in incredibly fast. I use it for all my deployed code.
https://beartype.readthedocs.io/en/latest/
I suspect either Pyrefly or ty will be highly complementary with Beartype: one covering editor-time checking, the other runtime enforcement.
The docs also have a great sense of humour.
beartype is great, but I only find it useful at the edges. Runtime checks aren't needed if you have strict typing throughout the project. On a gradually-typed codebase, you can use beartype (e.g. is_bearable) to ensure the data you're ingesting has the proper type. I usually use it when I'm dealing with JSON types.
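A minimal sketch of both modes, the decorator and `is_bearable`:

    from beartype import beartype
    from beartype.door import is_bearable

    @beartype
    def mean(xs: list[float]) -> float:
        return sum(xs) / len(xs)

    mean([1.0, 2.0])   # fine
    # mean(["oops"])   # raises a beartype violation at call time

    payload = [{"id": 1}, {"id": 2}]  # e.g. freshly parsed JSON
    assert is_bearable(payload, list[dict[str, int]])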
Why isn’t it necessary? Do you mean that with edit-time type checking you can catch all errors, so there's no need to verify at runtime that values actually match the edit-time type declarations?
What about interacting with other libraries?
1 reply →
Any progress in the Python ecosystem on static checking for things like tensors and data frames? As I understand it, the comments at the bottom of this FAQ still apply:
https://docs.kidger.site/jaxtyping/faq/
I have a problem with Python's `Optional` type. For example, for the following code:
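(A representative sketch; any arithmetic on an `Optional[int]` will do.)

    from typing import Optional

    def add_squares(a: int, b: Optional[int] = None) -> int:
        return a**2 + b**2  # error: `b` may be None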
Many type checkers throw an error, because `Optional[int]` actually means `int | None`, and you cannot square an `int` or a `float` with a `None`. Are there any plans for *ty* around this?
I think that's a genuine error, since as you say, `None` is a possible value for `b` according to your signature.
To handle this you would need to use "narrowing" to separately handle the case where `b` is `None`, and the case where it is not:
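Something like this (a sketch; the playground link below shows the same idea):

    def add_squares(a: int, b: Optional[int] = None) -> int:
        if b is None:
            return a**2
        return a**2 + b**2  # `b` is narrowed to `int` here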
https://play.ty.dev/97fe4a09-d988-4cc3-9937-8822e292f8d1
(This is not specific to ty, that narrowing check should work in most Python type checkers.)
I'd like to experiment with writing a reflection-based dynamic type annotator for one of these, just as a toy. Imagine a module which monkey-patches every function and class in the parent module it is loaded into, then reflects on the arguments at run time, slowly building up a database of expected type signatures. The user would then pick through the deltas, approving what made sense.
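A toy sketch of the monkey-patching half (recording only positional argument types; classes, methods, and builtins are ignored):

    import functools
    import types

    def instrument(module, seen):
        """Wrap every function in `module`, recording observed argument types."""
        for name, fn in list(vars(module).items()):
            if isinstance(fn, types.FunctionType):
                def wrapper(*args, _name=name, _fn=fn, **kwargs):
                    seen.setdefault(_name, set()).add(
                        tuple(type(a).__name__ for a in args))
                    return _fn(*args, **kwargs)
                setattr(module, name, functools.wraps(fn)(wrapper))

After a test run, `seen` maps each function name to the tuples of argument types it actually received, ready to diff against any existing annotations.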
This is a neat idea, but for large systems it could take a long time to fully exercise every code path with all possible input data. (If this sounds way too excessive, that's because normal people don't do this.)
Unless you mean something like recording prod for a few weeks to months, similar to how Netflix uses eBPF (except they run it all the time).
I think my main use case would be to run it on unit tests, and then to provide a human-in-the-loop way to slowly hydrate small to medium sized code bases with types. Maybe progressive refactor tools aren't something anyone actually wants to use or maintain, because I just don't see them around that much.
Are any of these already useful as LSPs for code editors such as Neovim? I run pyright in my Neovim config but I could certainly use something faster.
Not yet, tbh. I would realistically expect a year or more before you can expect any kind of parity with e.g. pyright/basedpyright. Type checkers are hard and have a long tail of functionality that must be implemented.
ty is definitely not ready to be a pyright replacement yet. But it is usable as an LSP for simple things like go to definition, and deeper LSP features are on the roadmap for the eventual beta and GA releases.
https://github.com/astral-sh/ty/blob/main/docs/README.md#oth...
What's the plan for ruff? Will it be part of ty one day?
1 reply →
What kind of codebases cause performance problems?
I've been using pyright with neovim for years and have never experienced any kind of noticeable lag.
ty will have a lower barrier to entry, similar to the early days of TypeScript.
I'm curious to see which way the community will lean.
It is sad to see how painful typed Python is … I prefer a language where (gradual) types were designed in from the get-go … https://raku.org
As someone who has added type hints to two huge code bases that had none: it's not that painful. Something much more painful is realizing that the code base you are adding type hints to is irredeemably bugged by design, in ways that would not have been possible had type checking been used.