Comment by lacker
9 days ago
Dependent types are very useful for some things. For example, I wish Python had the ability to express "a 10 x 5 matrix of float32s" as a type, and typecheck that.
The Curry-Howard correspondence, using dependent type system to have "proofs" be equivalent to "types", is powerful, but it can be really confusing. From a human point of view, there is a huge difference between "Your proof is wrong" and "You wrote a statement that fails typechecking".
Intuitively, when you make an error with types, it should be something fairly trivial that you just read the error and fix it up. When you make an error in a proof, it's understandable if it's very complicated and requires thought to fix. The natural UI is different.
So I agree with the author that the greatest benefit of Lean is not its typesystem per se, but its community. Specifically the fact that Lean's library of mathematics, mathlib, is organized like an open source community with pull requests. Whereas Isabelle's library of mathematics, the AFP, is organized like a scientific journal with referees.
I'm working on a dependent type system at the moment for a new theorem prover - Acorn, at https://acornprover.org - and my hope is to combine the good points of both Lean and Isabelle. It's nice that Lean has the power to cleanly express the simple dependent types that mathematicians often use, like vector spaces or quotients. But if you use dependent types too much then it does get complicated to debug what's happening.
> For example, I wish Python had the ability to express "a 10 x 5 matrix of float32s" as a type
To clarify, as long as 5 and 10 are constants, this is entirely possible in C++ and Rust^1, neither of which are dependently typed (or at most are dependently typed in a very weak sense). In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known. A proper dependently-typed language can prevent runtime out-of-bound errors even if neither the indices nor the bounds are known at type check time.
^1 And weakly in many other languages whose builtin array types have compile-time bounds. But C++ and Rust let user-defined generic types abstract over constant values.
You can do that in python using https://github.com/patrick-kidger/torchtyping
looks like this:
There's also https://github.com/thomasahle/tensorgrad which uses sympy for "axis" dimension variables:
Quick heads-up that these days I recommend https://github.com/patrick-kidger/jaxtyping over the older repository you've linked there.
I learnt a lot the first time around, so the newer one is much better :)
1 reply →
Is there a mypy plugin or other tool to check this via static analysis before runtime? To my knowledge jaxtyping can only be checked at runtime.
1 reply →
> In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known.
Yeah, this seems like matrixes might not be a great first example for explaining the value of dependent types. It's fully possible to define a matrix that uses a generic type as the index for each dimension that doesn't allow expressing values outside the expected range; it would just be fairly cumbersome, and the usual issues would creep back in if you needed to go from "normal" integers back to indexes (although not if you only needed to convert the indexes to normal integers).
I find that the potential utility of dependent types is more clear when thinking about types where the "dimensions" are mutable, which isn't usually how I'd expect most people to use matrixes. Even a simple example like "the current length of a list can be part of the type, so you define a method to get the first element only on non-empty lists rather than needing them to return an optional value". While you could sort of implement this in a similar as described above with a custom integer-like type, the limitations of this kind of approach for a theoretically unbounded length are a lot more apparent than a matrix with constant-sized dimensions.
The advantage of dependent typing systems is that you can say that the arguments to a matrix multiplication function must have dimensions (m, i) and (i, n) and that result will have dimensions (m, n).
9 replies →
Yes, the point of dependent types is that they give you the ability to do some sort of almost arbitrary (though not strictly Turing-complete) "computation" as part of type checking, which essentially dispenses with the phase separation between "compiling" and "running" code - or at least makes compile-time computation unusually powerful. So if you want to replicate the properties of dependent typing in these existing languages you'll need to leverage their existing facilities for compile-time metaprogramming.
> A proper dependently-typed language can prevent runtime out-of-bound errors even if neither the indices nor the bounds are known at type check time.
Yes, but the way this is done is by threading a proof "this index is within bounds" throughout the code. At runtime (e.g. within 'extracted' code, if you're using common dependently-typed systems), this simply amounts to relying on a kind of capability or ghost 'token' that attests to the validity of that code. You "manufacture" the capability as part of an explicit runtime check when needed (e.g. if the "index" or "bounds" come from user input) and simply rely on it as part of the code.
You can do arbitrary computations as part of type checking in C++, yet I don't think it should be considered dependently typed.
It seems to me that dependent typing strictly requires going from runtime values to types.
(You can parametrize types by runtime values in c++ in a trivial sense, by enumerating a finite set a compile time and then picking the correct type at runtime according to a runtime value, still I don't think it counts as the set of valid types would be finite).
8 replies →
The latter is what would be most useful imo. Even Matlab can type check matrix sizes with constants these days, but I often wish I could use variables to express relationships between the sizes of different dimensions of inputs to a function.
That's a good point, for example in Eigen you can do
Eigen::Matrix<float, 10, 5>
I just really want it in Python because that's where I do most of my matrix manipulation nowadays. I guess you also would really like it to handle non-constants. It would be nice if these complicated library functions like
torch.nn.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None, batch_first=False, device=None, dtype=None)
would actually typecheck that kdim and vdim are correct, and ensure that I correctly pass a K x V matrix and not a V x K one.
Python is a dynamic language where everything happens at run time, including checks of variable types. So you can already do this.
4 replies →
In Common Lisp:
And the type check will return true.
Compile-time checks is what's implied in virtually all these conversations.
Does that check run at compile time?
4 replies →
C has such types and can guarantee that there is no out-of-bounds access at run-time in the scenarios you describe: https://godbolt.org/z/f7Tz7EvfE This is one reason why I think that C - despite all the naysayers - is actually perfectly positioned to address bounds-safe programming.
Often in dependently-types languages one also tries to prove at compile-time that the dynamic index is inside the dynamic bound at run-time, but this depends.
-fsanitize-bounds uses a runtime address sanitizer, surely? The program compiles fine. In a (strongly) dependently typed language, something like the following would refuse to typecheck:
The type checker would demand a proof that i is in bounds, for example
In languages with an Option type this could of course be written without dependent types in a way that's still correct by construction, for example Rust:
But ultimately, memory safety here is only guaranteed by the library, not by the type system.
2 replies →
I thought that this was a GCC extension(you need to use #define n 10 instead of int n = 10). Is this not the case anymore?
1 reply →
"In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known."
I remember being able to use telescopes [1] in Haskell long time ago, around 2012 or so.
[1] https://www.pls-lab.org/en/telescope
Haskell was not and is not properly dependently typed.
> using dependent type system to have "proofs" be equivalent to "types"
Do you mean proposition as types? And proof has a program?
Or do you see it at somewhat a higher order, since values are proof for a type, and types can perhaps be proof at a higher level if they are first class citizen in the language?
(Which makes me think that dependent types is really a stairway between two type systems?)
Just wondering, since I had never personally seen it described in that fashion.
Other question: What is the function and runtime treatment of dependent types? Can they be instantiated on the spot by runtime values? A bit like defining/declaring functions that return arrays of runtime-known length? Does it involve two aspects of type checking? compile and runtime? Implementation-wise, doesn't it require too many indirections?
Yeah, I just meant proposition-as-types. Sloppy terminology by me. I meant to complain about the general conflation of the "proving system" and the "typechecking system".
Generally in dependent type systems you are limited in some way in what sort of values a type can refer to. The fancy type checking happens at compile time, and then at run time some or most of the type information is erased. The implementation is tricky but the more difficult problem IMO is the developer experience; it involves a lot of extra work to express dependent types and help the compiler verify them.
One of the things I love about typescript is if your audacious enough you can kind of make some types like this (still a far cry from dependent types):
https://github.com/AKST/analysis-notebook/blob/main/lib/base... (Line 38)
It type checks, but I haven’t gotten a lot of mileage out of this (so far at least). I did something similar for vectors, which has been useful for when I destructure the elements, and it preserves that through vector operations.
https://github.com/AKST/analysis-notebook/blob/777acf427c65c...
The same is actually applies true for matrices, I wrote the multiplication to carry the new size into the output type
https://github.com/AKST/analysis-notebook/blob/777acf427c65c...
That said I mostly use this for square matrixes in web GL so you’re mostly working with square matrixes.
As an aside, this source is from a toy project largely more for leisure and fun, I don’t know if something like this would be suitable for a larger more data oriented project (at least in trend of how I represent data here).
As other commenters have said, this doesn't need dependent types. Rust's const generics / C++'s non-type template parameters are enough.
Even in languages with dependent types, people tend to find it overly cumbersome to use them for actually prevent runtime errors like out-of-bounds access; they instead tend to save the dependent types for (1) data structure invariants that can be propagated easily, and (2) verification after the program has been defined.[0][1]
[0]: https://xenaproject.wordpress.com/2020/07/05/division-by-zer...
[1]: https://www.cs.ox.ac.uk/ralf.hinze/WG2.8/26/slides/xavier.pd... ("Programming with dependent types: passing fad or useful tool?" by Xavier Leroy in 2009; things might have changed since 2009)
> Dependent types are very useful for some things. For example, I wish Python had the ability to express "a 10 x 5 matrix of float32s" as a type, and typecheck that.
What do you mean? Every class is a type, so create your own class and control what goes in.
Type checking is done statically without running the program. You don't need to execute any run-time logic to perform a check like this. What you are suggesting is a much, much weaker form of verification that doesn't require any type system at all.
I did a small small experiment in dependent typing for Python (and catching those errors before run time). It's a decently good fit, and with HM Type inference some neat things just fall out of the design.
But the real challenge is deciding how much of Python's semantics to preserve. Once you start enforcing constraints that depend on values, a lot of idiomatic dynamic behavior stops making sense. Curious what ways there are to work around this, but I think that's for a person better at Python than I :)
> For example, I wish Python had the ability to express "a 10 x 5 matrix of float32s" as a type, and typecheck that.
Numpy arrays have been checkable by dtype for a little while now, and I think recent versions also support shape constraints.
Can these constraints be statically checked with mypy or a language server?
The dtype constraint can with pyright (and presumably others). We're still on older versions of numpy so I don't have first hand knowledge of the shape constraints.