Dependent types and how to get rid of them
7 days ago (chadnauseam.com)
Recent and related: Why don't you use dependent types? - https://news.ycombinator.com/item?id=45790827 (116 comments)
I read the first post and thought someone should at the very least post the lambda cube. This isn't my area of expertise, since I've done very little with dependent types (staying firmly on the left side of the cube), but the cube outlines some useful categories of type systems:
https://en.wikipedia.org/wiki/Lambda_cube
I wrote a small post on that: https://azdavis.net/posts/lambda-cube/
Hope it’s helpful!
Cool that you worked on an implementation of the CoC.
I've been somewhat wanting to go back and revisit ATTAPL and read chapter 2 on this subject: https://www.cis.upenn.edu/~bcpierce/attapl/frontmatter.pdf
That is a good post. I've linked to it from mine!
I see we need to add special syntax to the signature for dependent type variables.
If you take Zig, its `comptime` parameters are kind of similar. They can be used to create functions that return types or whose output type depends on the inputs, etc. It seems to fulfil the three axioms at the start, no? The erasure stuff seems just as relevant.
Can I say that `comptime` is dependent types in imperative clothing? Or is there a missing capability making it strictly weaker?
I'm not familiar with Zig, but I'm pretty sure comptime is not quite as powerful.
In the post, I didn't give any examples of situations where the input to a dependent function was not known at compile time. However, that was just due to the lack of time when writing the post.
I gave this example:
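(In Idris-style notation; `pickType True = Nat` is pinned down by the quote below, and the `False` branch is just one possibility:)

  pickType : Bool -> Type
  pickType True  = Nat
  pickType False = String

  getValue : (b : Bool) -> pickType b
  getValue True  = 42
  getValue False = "hello"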
The difference is that nothing in dependent types requires that the depended-on value `b` is known at compile time. You could pass any Bool to `getValue`, including one that was read as input from the user. It would just be the case that, before using the value returned by `getValue b`, you'd have to check (at runtime) the value of `b`. Only then would you know `pickType b`, which would tell you the type of `getValue b`. My apologies that the post was unclear about this. (I've just updated it.)
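Something like this sketch, where the pattern match on the Bool is exactly that runtime check:

  use : Bool -> String
  use True  = show (getValue True)   -- here the checker knows getValue True : Nat
  use False = getValue False         -- and here getValue False : String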
If you don't know at compile time what type the returned value has, how do you know whether the remainder of the expression is still correct?
The choice is either to have a total function that can deal with every possible type, checked at compile time, or to defer to the runtime and throw an exception if the program doesn't know what to do with the value.
> Instead of writing the type directly, we're calling pickType and using the type returned. The typechecker evaluates pickType True at compile time, sees it equals Nat, and then it knows you're saying myNat has type Nat.
But how does the compiler evaluate `pickType True` at compile time? Does the typechecker contain an interpreter for (some subset of) the language? It seems potentially circular for a compiler to invoke functions defined in the code that it's compiling.
Yes, the compiler contains an interpreter for the entire language, not just a subset. It is circular as you rightly point out, but there's no contradiction here.
Typically the interpreter limits itself to evaluating only pure functions, i.e. functions that do not perform any kind of IO, but in principle this restriction is not necessary. For example, Jai's compiler allows completely arbitrary execution at compile time, including even downloading arbitrary data from the Internet and using that data to make decisions about how to compile code.
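As a tiny illustration of why the circularity is harmless (an Idris-flavored sketch; `len` is a made-up stand-in for any user-defined function): definitions are checked in order, so by the time `len` shows up inside a type, it has already been checked and the typechecker can safely interpret it.

  import Data.Vect

  len : List a -> Nat
  len []        = Z
  len (_ :: xs) = S (len xs)

  -- the typechecker runs `len [1, 2, 3]` down to 3 while checking this type
  three : Vect (len [1, 2, 3]) Int
  three = [10, 20, 30]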
Thank you for the explanation. This reminds me somewhat of an F# type provider, which can execute arbitrary code at compile time.
https://learn.microsoft.com/en-us/dotnet/fsharp/tutorials/ty...
I am slowly trying to understand dependent types, but the explanation is a bit confusing to me. I understand the mathematical terminology of a function that may return a type, but... since function types take a value and return a value, they are by definition in another universe from, say, morphisms that would take a type and return a type.
The same way, I see a value as an ur-element and types as sets of values. So even if there is syntactic sugar around the value <-> type equivalence, I'd naively think that we could instead define some type morphism, and that might be more accurate. The value parameter would merely be declared through a type parameter constrained to be a singleton, the same way an ur-element is not a set but a member of a set.
Then the question is representation, but that could be left as an optimization. Perhaps that is already what is done.
Example:
  type Nine int = {9}

And then the rest is generic functions, parameterizable by 9, or actually, by Nine.
So nothing too different from a refinement of int.
Basically, 'value' would be a special constraint on a type parameter in normal parametric polymorphism implementations. There would probably be derived constraint information such as size etc...
But I guess, the same issue of "which refinement types can be defined, while keeping everything decidable" remains as an issue.
Also how to handle runtime values? That will require type assertions, just like union types? Or is it only a compile time concept and there is no runtime instantiations. Only some kind of const generics?
A typeof function could be an example of dependent type though? Even at runtime?
Just wondering...
In type theory, all singleton types are isomorphic and have no useful distinguishing characteristics (indeed, this is true of all types of the same cardinality – and even then, comparing cardinalities is always undecidable and thus irrelevant at runtime). So your Nine type doesn’t really make sense, because you may as well just write Unit. In general, there is no amount of introspection into the “internal structure” of a type offered; even though parametricity does not hold in general (unless you postulate anticlassical axioms), all your functions that can run at runtime are required to be parametric.
Being isomorphic is not the same as being identical, or substitutable for one another. Type theory generally distinguishes between isomorphism and definitional equality and only the latter allows literal substitution. So a Nine type with a single constructor is indeed isomorphic to Unit but it's not the same type, it carries different syntactic and semantic meaning and the type system preserves that.
Another false claim is that type theory does not distinguish types of the same cardinality. Type theory is usually intensional, not extensional, so two types with the same number of inhabitants can have wildly different structures, and this structure can be used in pattern matching and type inference. Cardinality is a set-theoretic notion, but most type theories are constructive and syntactic, not purely set-theoretic.
Also parametricity is a property of polymorphic functions, not of all functions in general. It's true that polymorphic code can't depend on the specific structure of its type argument but most type theories don't enforce full parametricity at runtime. Many practical type systems (like Haskell with type classes) break it with ad-hoc polymorphism or runtime types.
How does it become Unit if it is an integer of value 9? Why would cardinalities be undecidable if they are encoded discretely in the type?
For instance, type Nine int = {9} would not be represented as 9. It would probably be a fat pointer. It is not just an int, it would not even have the same operations (9 + 9 is 18 which is an int but is not of type Nine, but that's fine, a fat pointer does not need to share the same set of operations as the value it wraps).
It could be seen as a refinement of int? I am not sure that it can truly be isomorphic? My suspicion is that it would only be somewhat isomorphic at compile time, for type checking, and only if there is a mechanism for auto-unwrapping the value?
In the style of the linked post, you'd probably define a generic type (well, one of two generic types):
type ExactlyStatic : (0 t: Type) -> (0 v: t) -> Type
type ExactlyRuntime : (0 t: Type) -> (v: t) -> Type
Then you could have the type (ExactlyStatic Int 9) or the type (ExactlyRuntime Int 9).
The difference is that ExactlyStatic Int 9 doesn't expose the value 9 to the runtime, so it would be fully erased, while (ExactlyRuntime Int 9) does.
This means that the runtime representation of the first would be (), and the second would be Int.
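One plausible way to define them (a sketch in Idris 2 flavor, where unbound implicits on constructors are erased by default; the exact placement of erasure annotations varies by language):

  data ExactlyRuntime : (t : Type) -> t -> Type where
    -- stores v, so its runtime representation is the value itself
    MkRuntime : (v : t) -> ExactlyRuntime t v

  data ExactlyStatic : (t : Type) -> t -> Type where
    -- stores nothing, so it erases to ()
    MkStatic : ExactlyStatic t v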
> Also how to handle runtime values? That will require type assertions, just like union types?
The compiler doesn't insert any runtime checks that you aren't writing in your code yourself. The difference is that now, when you write e.g. `if condition(x) then ABC else DEF`, inside each of the two branches you can obtain a proof that condition(x) is true/false, and propagate it.
Value representations will typically be algebraic-data-type flavored (so, often a tagged union) but you can use erasure to get more efficient representations if needed.
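For example (an Idris sketch; `condition` is a made-up predicate), `choose` from Data.So turns the branch you are in into a proof value you can pass along:

  import Data.So

  condition : Int -> Bool
  condition x = x > 0

  describe : Int -> String
  describe x = case choose (condition x) of
    Left  prf => "positive"      -- prf : So (condition x)
    Right prf => "not positive"  -- prf : So (not (condition x))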
Hi aatd86! We had a different thread about existence a couple days ago, and I'm just curious -- is English your second language? What's your first language? I'm guessing French, or something from Central Europe.
Thanks for humoring me!
Functions that return types are indeed at a higher level than those that don't.
Values can be seen as singleton types. The key difference is the universe they live in. From the universe of types, the level one below is perceived as values. Similarly, from the universe of kinds, types appear to be values.
> Basically, 'value' would be a special constraint on a type parameter in normal parametric polymorphism implementations
Yes this is a level constraint.
> But I guess, the same issue of "which refinement types can be defined, while keeping everything decidable" remains as an issue.
If you're dealing with fully computable types, nothing is decidable.
> Also how to handle runtime values? That will require type assertions, just like union types? Or is it only a compile time concept and there is no runtime instantiations. Only some kind of const generics?
A compiler for a dependently typed language essentially produces programs that have the compiler itself embedded in them, together with its input. There cannot be a hard distinction between runtime and compile time, because in general type checking requires you to be able to run the program. Compilation essentially becomes deciding which parts you want to evaluate now and which parts you want to defer until later.
> A typeof function could be an example of dependent type though? Even at runtime?
Typeof is just const.
typeof : (t : Type) -> (x : t) -> Type
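Spelled out (sketch), the whole implementation is one line:

  typeof : (t : Type) -> (x : t) -> Type
  typeof t _ = t   -- literally `const`: ignore the value, return the type it came with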
Final note: I've written some compilers for toy dependently typed languages. Dependent typing makes both the language and the compiler easier, not harder, by far. This is because Haskell, C++, and other languages with type systems and metaprogramming or generics actually contain several languages: the value language that we are familiar with, but also the type language.
In Haskell, this is the class/instance language, which is another logic programming language atop the lambda calculus language. This means that to write a Haskell compiler you have to write a compiler for Haskell and for that logic programming language (which is Turing complete, by the way).
Similarly, in C++ you have to implement C++ AND the template system, which is a functional programming language with incredibly confusing semantics.
On the other hand, for a dependently typed language you just write one language... The types are talked about in the same way as values. The only difficulty is type inference. An explicitly typed dependent language is actually the easiest typed compiler to write, because it's literally just checking for term equality, which is very easy! And with a dependent language, type inference is never going to be perfect anyway, so you're a lot freer to make your own heuristic decisions and forego HM-like perfection.
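To make the "one language" point concrete (an Idris sketch): an ordinary value-level function can be used directly inside a type, and the typechecker evaluates it by the same rules as any other code; no separate type-level copy is needed.

  import Data.Vect

  -- an ordinary function on values...
  add : Nat -> Nat -> Nat
  add Z     n = n
  add (S m) n = S (add m n)

  -- ...reused directly in a type
  concat' : Vect n a -> Vect m a -> Vect (add n m) a
  concat' []        ys = ys
  concat' (x :: xs) ys = x :: concat' xs ys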
This made me realize TypeScript’s conditional types are closer to dependent types than I thought.
Yes. You can give pickType a type in TS. Something like pickType<B extends boolean>(b: B): B extends true ? string : number.
I’d love to read a post explaining how TS conditional types are or are not a form of dependent types. Or, I’d like to understand dependent types well enough to write that post.
"But we'd like to give a type to this function. In most languages (Haskell, Typescript, C++, Rust, C#, etc.) there is no way to do that.[2] "
Can't you do this with a GADT? With a GADT, the type of the output can be specified by which branch of the GADT the input matches. Seems quite a similar idea.
GADTs let you do similar things in the simplest cases; they just don't work as well for complex constraints.
Like, append: Vector n t -> Vector m t -> Vector (n+m) t. With GADTs you need to manually lift (+) to work on the types, and then provide proofs that it's compatible with the value-level.
Then you often need singletons for the cases that are more complex: if I have a set of integers and an int x, and I want to constrain that the item is in the set and is also even, the GADT approach looks like:
doThing :: Singleton Int x -> Singleton (Set Int) s -> IsInSet x s -> IsEven x -> SomeOutput
Which looks reasonable, until you remember that the Singleton type has to have a representation which is convenient to prove things about, not a representation that is convenient/efficient to operate on, which means your proofs will be annoying or your function itself will hardly resemble a "naive" untyped implementation.
Dependent types fix this because your values are "just" values, so it would be something like
doThing :: (x : int) -> (s : set int) -> so (is_in s x) -> so (even x) -> SomeOutput
Here, x and s are just plain values, not a weird wrapper, and is_in and even are ordinary (and arbitrary) library functions.
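And the append from earlier is likewise just the naive program (Idris sketch; Idris calls the type Vect, the ordinary (+) on Nat works inside the type as-is, and this is essentially how Data.Vect's (++) is defined):

  import Data.Vect

  append : Vect n t -> Vect m t -> Vect (n + m) t
  append []        ys = ys
  append (x :: xs) ys = x :: append xs ys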
I mean, you can trivially do it with a (non-G) ADT, if you're content with wrapping the return value.
OTOH you can do this in TypeScript easily without wrapping; it could return a union type `number | string`.
Off the topic of dependent types, but this is way too hilarious to pass up..
> Imagine if pickType did a network request to a server and then returned whatever type the server said. Sounds like a bad idea!
Fucking .. isn't this what LSPs are?! Finally something functional programming people and I can agree on. That does sound like a bad idea.
I think it's a bit apples and oranges. I was suggesting that compilation itself should probably not involve nondeterministic network requests. If I understand LSP correctly, it only uses network requests to allow your editor to communicate with a binary that provides type information. But the compiler behavior is unchanged. Honestly LSPs seem pretty reasonable to me.
What's the fundamental difference between a syntax highlighter and the frontend of a compiler? I would argue not much, apart from the fact that people are willing to have their syntax highlighter randomly explode, but are much more concerned about their compiler randomly exploding.
No, LSPs return the name/metadata of a concrete type. Dependent typing means that the return type of any given function in your (static) program can depend on a runtime value, e.g. user input... In well-defined ways ofc.
So, you're saying it's outside the scope of an LSP to return information about a dependent type because it's .. not a concrete type? That sounds wrong.
I can make literally any language that has struct, enum, switch, and assert support dependent types. You make a boxed type (tagged union, algebraic datatype, variant, discriminated union, fucking, whatever), and just assert whenever you pack/unpack it on function entry/exit. I do this all the time.
In plain English, my quip boils down to 'why do we tolerate network requests in our syntax highlighters, when we don't tolerate them in our compiler frontends?'