Does your language even need (complex) type inference?
Personally I am a bit skeptical; complex type inference may well do more harm than good in some cases. A valid alternative approach is to just make type declarations required. Sure, infer trivial types like when doing variable assignment, but other than that just expect types to be provided. This drastically cuts down on complexity and enforces more readable programs. And it gives you much better error messages.
Or go the opposite way: if you want a language that feels dynamic and lends itself to prototyping, a type system that is total and complete might be too heavy. Instead of only allowing programs that are proven to be typed correctly, you might want to allow all programs that cannot be proven wrong. Lean into gradual typing: anything goes at first, and the typing becomes as strict as the programmer decides, based on how much type information they add.
Those are both very valid approaches. Not every language needs that level of type inference.
> Or go the opposite way: if you want a language that feels dynamic and lends itself to prototyping, a type system that is total and complete might be too heavy. Instead of only allowing programs that are proven to be typed correctly, you might want to allow all programs that cannot be proven wrong. Lean into gradual typing: anything goes at first, and the typing becomes as strict as the programmer decides, based on how much type information they add.
If you have generics and want type annotations to type check at compile time, you are going to need unification,
let l: List<Animal> = List(dog, cat)
At that point, you have written all the machinery to do inference anyway, so might as well use it.
I guess you could have a language where the above must be gradually typed like
let l: List<Animal> = List<Animal>(dog: Animal, cat: Animal)
but that doesn't seem particularly ergonomic.
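For concreteness, here is a minimal sketch of that machinery: a toy first-order unifier in OCaml, with made-up TVar/TCon constructors rather than any real compiler's representation, and the occurs check omitted for brevity.

    type ty =
      | TVar of string            (* an inference variable, e.g. List's element type *)
      | TCon of string * ty list  (* a constructor applied to arguments, e.g. List<Animal> *)

    (* A substitution maps inference variables to types. *)
    type subst = (string * ty) list

    let rec apply (s : subst) = function
      | TVar v -> (try List.assoc v s with Not_found -> TVar v)
      | TCon (c, args) -> TCon (c, List.map (apply s) args)

    exception Unify_error of string

    (* Unify two types, extending the substitution s. *)
    let rec unify s t1 t2 =
      match apply s t1, apply s t2 with
      | TVar v, t | t, TVar v -> (v, t) :: s
      | TCon (c1, a1), TCon (c2, a2) when c1 = c2 && List.length a1 = List.length a2 ->
          List.fold_left2 unify s a1 a2
      | _ -> raise (Unify_error "constructor mismatch")

    (* Checking `let l: List<Animal> = List(dog, cat)` amounts to unifying the
       declared type with the constructor's type List<'a>: *)
    let () =
      let declared = TCon ("List", [TCon ("Animal", [])]) in
      let applied  = TCon ("List", [TVar "a"]) in
      let s = unify [] declared applied in
      assert (apply s (TVar "a") = TCon ("Animal", []))

Once you have this, running it over whole expressions instead of single annotations is most of the way to inference.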
HM is not complex type inference. In fact, among all the approaches you cite, it leads to the simplest type system and the simplest implementation. Moreover, there are lots of courses, reference implementations, and reasonable extensions for a wide array of features (structural subtyping, qualified types, etc). There are even type-system libraries to make it easy to implement (like Inferno).
When new programmers discover an ML-family language, they are often stunned by how lightweight it feels, because you don't have to annotate anything. If your types are not structurally too complicated and you want something really easy to use, HM is still the nicest experience.
Naturally, it's all a tradeoff, and if you want specific features (chiefly: borrows, higher-order types, overloading, or TypeScript-like features), you will need to abandon complete inference (and use something like bidirectional typing, most likely).
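For a taste of that lightweight feel, here are a few annotation-free OCaml definitions together with the fully general types the toplevel reports for them:

    let compose f g x = f (g x)
    (* val compose : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)

    let rec map f = function
      | [] -> []
      | x :: rest -> f x :: map f rest
    (* val map : ('a -> 'b) -> 'a list -> 'b list *)

    let pair x y = (x, y)
    (* val pair : 'a -> 'b -> 'a * 'b *)

Not a single type is written, yet every definition gets its most general polymorphic type.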
Type inference saves typing on the keyboard.
Ironically the language which needed this most was C++, which took ages to get the "auto" keyword and can still have a bit of a problem with fully expanded template names in error messages.
> Instead of only allowing programs that are proven to be typed correctly, you might want to allow all programs that cannot be proven wrong. Lean into gradual typing: anything goes at first, and the typing becomes as strict as the programmer decides, based on how much type information they add.
Yes, this is the way. And if you ensure that the type system is never abused to control dispatch - i.e. can be fully erased at runtime - then a proposed language supplied with some basic types and an annotation syntax can be retrofitted with progressively tighter type-checking algorithms without having to rewire that underlying language every time.
Developers could even choose whether they wanted super-safe checking (that precludes many legal forms), or more lenient checking (that allows many illegal forms), all in the same language!
> friends don’t just bring up type inference in casual conversation
I wonder if this is a reference to "I need you to understand that people don't have conversations where they randomly recommend operating systems to one another"
But to the actual point of the article: my understanding is that there are areas where you can use bidirectional typing (e.g. languages that have subclasses) where HM style type inference might become undecidable.
I once studied proof theory for a summer at a school in Paris and we talked about type inference and theorem proving all the time in casual conversation, over beers, in the park. It was glorious.
Being a student is so much fun, and we often waste it, or at least don't value it as much as we ought. 20 years later I'd love to go back.
> Being a student is so much fun, and we often waste it, or at least don't value it as much as we ought. 20 years later I'd love to go back.
An aside, but some years ago I watched the demo 1995 by Kewlers and mfx[1][2] for the first time and had a visceral reaction precisely due to that, thinking back to my teen years tinkering on my dad's computer, trying to figure out 3D rendering and other effects inspired by demos like Second Reality[3] or Dope[4].
I seldom become emotional but that 1995 demo really brought me back. It was a struggle, but the hours of carefree work brought the joy of figuring things out and getting it to work.
These days it's seldom I can immerse myself for hours upon hours in some pet project. So I just look things up on the internet. It just doesn't feel the same...
[1]: https://www.youtube.com/watch?v=weGYilwd1YI
[2]: https://www.pouet.net/prod.php?which=25783
[3]: https://www.pouet.net/prod.php?which=63
[4]: https://www.pouet.net/prod.php?which=37
Join us in ##dependent on Libera IRC. We continue to talk about this stuff all the time, with a focus on Martin-Löf intuitionistic type theory.
> I wonder if this is a reference to "I need you to understand that people don't have conversations where they randomly recommend operating systems to one another"
It is!
> my understanding is that there are areas where you can use bidirectional typing (e.g. languages that have subclasses) where HM style type inference might become undecidable
There are! Afaik most languages end up with a bidirectional system in practice for this reason. Haskell started out HM and has shifted to bidir because it interacts better with impredicative types (and visible type applications). Bidir can handle fancy features like subtyping and all sorts of nifty stuff.
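For readers wondering what "bidir" actually looks like, here is a minimal sketch: a toy OCaml checker for a tiny lambda calculus (not how GHC or any real compiler does it). `infer` synthesizes a type bottom-up, `check` pushes an expected type inward, and annotations are only needed where synthesis would otherwise fail.

    type ty = TInt | TArrow of ty * ty

    type expr =
      | Var of string
      | IntLit of int
      | Lam of string * expr        (* note: no annotation on the binder *)
      | App of expr * expr
      | Ann of expr * ty            (* a user annotation switches from checking to inferring *)

    exception Type_error of string

    (* infer synthesizes a type; check pushes an expected type inward. *)
    let rec infer env = function
      | Var x -> (try List.assoc x env with Not_found -> raise (Type_error ("unbound " ^ x)))
      | IntLit _ -> TInt
      | Ann (e, t) -> check env e t; t
      | App (f, a) ->
          (match infer env f with
           | TArrow (dom, cod) -> check env a dom; cod
           | _ -> raise (Type_error "applying a non-function"))
      | Lam _ -> raise (Type_error "cannot infer a bare lambda; annotate it")

    and check env e expected =
      match e, expected with
      | Lam (x, body), TArrow (dom, cod) -> check ((x, dom) :: env) body cod
      | _ ->
          let actual = infer env e in
          if actual <> expected then raise (Type_error "type mismatch")

    (* The expected type is pushed into the lambda, so its binder needs no annotation: *)
    let () = check [] (Lam ("x", Var "x")) (TArrow (TInt, TInt))

The charm is that there is no unification variable anywhere: every error is reported at the point where an expected and an actual type disagree, which is why bidir tends to give such predictable messages.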
The subject does sometimes come up in my casual conversations, since Robin Milner was my first CS lecturer.
He never actually spoke about type inference in my presence. He did teach me CCS (pi-calculus predecessor) a couple of years later, by which time I could appreciate him.
If your type system is HM, consider a compositional type system instead, for much better explainability of type derivations and type errors: https://unsafePerform.IO/projects/talks/2016-06-compty/CompT...
That is the best use of non-canonical domain-name capitalization I've ever seen.
I have my own programming language (pretty advanced), but I don't even know what these two typing approaches are. Is that a problem? Or do I just have one of the two without knowing it?
I'm going to be contrarian: yes, you should learn about type systems if you want to design a programming language, and decide with full awareness what you need. At the very least, it will give you a concrete idea of what safety means for a programming language.
It doesn't mean you have to use an advanced one, but your choice should be based on knowledge, not ignorance.
A lot of harm, including the billion-dollar mistake, has been done by badly designed type systems from the Java/C/C++ family.
Probably not. Most popular programming languages have messy - unsound and/or undecidable - type systems, e.g. C++, C#, TypeScript, Java...
...because that is more practical.
You suggest that the programming language developers made a conscious choice to implement "messy" type systems. That's not at all how it came about. These messy type systems are generally the result of trying to work with earlier language design mistakes. TypeScript is an obvious example: its type system was designed to be able to type a reasonable subset of the mess of existing JavaScript. There is no need to make the same mistakes in a new language.
I don't think it's more practical; being able to do things like type inference on return values is actually really cool. Maybe it's more practical for the programming language developer (less learning about type systems) than for the user... but then you have to ask why build another language?
Can you give an example of TypeScript's unsoundness that cannot be fixed with better encodings?
> What folks should actually be asking is “Does my language need generics?”.
You should also ask “Does my language need subtyping such as subclasses?” And if the answer to both is yes, you should probably forget about Hindley Milner, or at least pick something far away from it on the spectrum.
In this case do yourself a favor and use something like simple-sub
https://github.com/LPTK/simple-sub
https://www.reddit.com/r/ProgrammingLanguages/comments/hpi54...
https://dl.acm.org/doi/10.1145/3409006
HM handles sub-typing just fine? Numerous approaches have been known since the 1980s - Michael Wand’s row polymorphism is one such approach.
https://en.wikipedia.org/wiki/Row_polymorphism
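For a quick taste of what that looks like in practice, here is a toy sketch using OCaml's object types (illustrative values only); the `..` in the reported type is the row variable, meaning "any object with at least a name method":

    let greet o = "hello, " ^ o#name
    (* val greet : < name : string; .. > -> string *)

    let () =
      print_endline (greet (object method name = "cat" end));
      print_endline (greet (object method name = "dog" method legs = 4 end))

No class hierarchy anywhere, yet greet works across otherwise unrelated structural types, and it all fits inside plain HM inference.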
Structural subtyping, yes; nominal subtyping is a bit pricklier.
As a developer I personally prefer structural subtyping, but structural subtyping is harder for a compiler to optimize runtime performance for.
Nominal subtype hierarchies allow members to be laid out linearly, so a member access becomes just an offset, whereas a structural system always has the "diamond problem" to solve (it's hidden from users, so not a "problem" for them, but it will still haunt compiler/runtime developers).
Here's the kicker though: in practice nominal subtype polymorphism has other performance issues on _modern_ computers, since it creates variable-sized objects that cannot be packed linearly the way monomorphic structures can.
In the 90s, when languages settled on nominal typing, memory speed wasn't really a huge issue, but today we know that we should rather compose data to achieve data-polymorphic effects, and that individual types should be geared towards dense packing.
Thus, most performance benefits of a nominal type system over a structural one don't help much in real-life code, and maintenance-wise we would probably have been better off using structural types (IIRC Go went there, and interfaces in Java/C# achieve mostly the same effect in practice).
I've been implementing row polymorphism in my fork of Elm in order to support proper sum and subtraction operations on unions, and it's far from trivial.
Example use case: an Effect may fail with two types, but you have actually handled/caught one of them, so you want to remove it from the union.
Elm-like HM systems handle row polymorphism fine, but, as you say, mostly over records.
I'm not an expert in all of this (I only started studying it recently), so take my words with a grain of salt.
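To make the use case concrete, here is a rough analogue using OCaml's polymorphic variants rather than Elm (constructor names are made up): once `Timeout has been handled, it no longer appears in the error row of the inferred result type.

    let handle_timeout = function
      | Ok v -> Ok (Some v)
      | Error `Timeout -> Ok None                          (* handled here *)
      | Error (`Bad_input msg) -> Error (`Bad_input msg)   (* still propagated *)
    (* inferred type:
       ('a, [< `Bad_input of 'b | `Timeout ]) result
       -> ('a option, [> `Bad_input of 'b ]) result *)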
*Mitchell Wand
Eh, generics kinda do introduce a subtyping relation already. It's just that HM's Gen rule - e : σ implies e : ∀α.σ, when α is not free in the environment - is restrictive enough that this subtyping relation can be checked (and inferred) using just unification, which is quite an amazing result.
The real question is unification vs bidir more than HM vs bidir.
Unification is simple, not very hard to implement and more powerful. Bidir gives better error messages and is more "predictable".
I personally lean strongly towards unification. I think you can get good enough error messages, and what you lose with bidir is not worth it. But clearly the Rust core team disagreed; they don't mind annotations.
Anyway, here is my non answer: it's a trade off.