Comment by rixed

2 months ago

Tagged unions (not enums, sorry) are not a dynamic type system concept. Actually, I would not be able to name a single dynamically typed language that has them.

As for the memory allocation, I can't see why any object should have the size of the largest alternative. When I do the manual equivalent of a tagged union in C (i.e. a struct with a tag followed by a union) I malloc only the required size, and a function receiving a pointer to this object had better not assume any size before looking at the tag. Oh, you mean when the object is automatically allocated on the stack, or stored in an array? Yes then, sure. But that's going to be small change if it's on the stack, and for the array, well, there is no way around it; if it does not suit your design then keep only the tags in the array?
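For the array case, one possible workaround in Rust (not the C approach described above) is to box the large alternative, so that array elements stay small and the heap allocation only happens for the variant that needs it. The sketch below is an illustration only: the names `Inline`, `Boxed`, `payload_len` and the payload size `BIG` are hypothetical, and the exact sizes printed depend on the target and compiler.

```rust
use std::mem::size_of;

// Hypothetical payload size, chosen only to make the difference visible.
const BIG: usize = 1024;

// Every value is as large as the largest alternative, stored inline.
#[allow(dead_code)]
enum Inline {
    Small(u8),
    Big([u8; BIG]),
}

// The large alternative lives on the heap; the enum itself holds a pointer.
#[allow(dead_code)]
enum Boxed {
    Small(u8),
    Big(Box<[u8; BIG]>),
}

// The tag is checked by `match` before the payload can be touched.
fn payload_len(item: &Boxed) -> usize {
    match item {
        Boxed::Small(_) => 1,
        Boxed::Big(bytes) => bytes.len(),
    }
}

fn main() {
    let items = vec![Boxed::Small(7), Boxed::Big(Box::new([0u8; BIG]))];
    let total: usize = items.iter().map(payload_len).sum();
    println!("Inline element: {} bytes", size_of::<Inline>());
    println!("Boxed element:  {} bytes", size_of::<Boxed>());
    println!("payload bytes in use: {}", total);
}
```

The trade-off is an extra pointer indirection for the boxed variant in exchange for a compact array.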

Tagged unions are a thing, whether the language helps or not. When I program in a language that has them, they are probably a sizeable fraction of all the types I define. I believe they are fundamental to programming, and I'd prefer the language to help with syntax and some basic sanity checks; for example, a dynamic sizeof that reads the tag so it's easier to malloc the right amount, or a syntax that makes it impossible to access the wrong field (i.e. any lightweight pattern matching will do).

In other words, I couldn't really figure out the downside you had in mind :)

2 comments

flohofwoe 2 months ago

> Actually, I would not be able to name a single dynamically typed language that has them.

That's because every type in a dynamically typed language is a tagged union ;) For instance, in JavaScript you need to inspect a variable with 'typeof' to find out if it is a string, a boolean, a number or something else.

In a dynamically typed language, the runtime system needs to carry around information about what type an item actually is, and this is the same thing as the type tag in a tagged union. Rust's match is the same sort of runtime type inspection as the typeof in JS, just with slightly different syntactic sugar.
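To make the comparison concrete, here is a minimal sketch of what a typeof-style check could look like on a Rust enum. The `Value` type and the `type_name` function are hypothetical names made up for this illustration; the returned strings simply mirror what typeof reports in JS for those three kinds of value.

```rust
// A hypothetical stand-in for a dynamically typed value.
#[allow(dead_code)]
enum Value {
    Bool(bool),
    Number(f64),
    Text(String),
}

// Inspects the tag at runtime, much like `typeof` would.
fn type_name(v: &Value) -> &'static str {
    match v {
        Value::Bool(_) => "boolean",
        Value::Number(_) => "number",
        Value::Text(_) => "string",
    }
}

fn main() {
    let values = [
        Value::Bool(true),
        Value::Number(1.5),
        Value::Text(String::from("hi")),
    ];
    for v in &values {
        println!("{}", type_name(v));
    }
}
```

One difference is that the compiler rejects the match if an alternative is left unhandled, whereas a typeof chain can silently miss a case.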

> As for the memory allocation, I can't see why any object should have the size of the largest alternative.

When you have a Rust enum like this:

    enum Bla {
        AByte(u8),
        AString(String),
        AStruct{ x: i64, y: i64 },
    }

...then every Bla object is always at least as large as its largest alternative, even when the active item is 'AByte'. That is at least 16 bytes for AStruct alone, and since a String is typically three pointer-sized words (24 bytes on 64-bit targets) whether it is empty or not, in practice it is more. Plain unions in C have the same problem of course, but those are rarely used. The one thing where unions are really useful in C (not C++!) is to have different views on the same memory.
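The size claim is easy to check with `std::mem::size_of`. This is only a sketch: the exact numbers depend on the target and the compiler version, so none are hard-coded here, and the `bla_size` helper is a hypothetical name added for the example.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Bla {
    AByte(u8),
    AString(String),
    AStruct { x: i64, y: i64 },
}

// Size in bytes of every `Bla` value, whichever alternative is active.
fn bla_size() -> usize {
    size_of::<Bla>()
}

fn main() {
    println!("size_of::<u8>()     = {}", size_of::<u8>());
    println!("size_of::<String>() = {}", size_of::<String>());
    println!("size_of::<Bla>()    = {}", bla_size());
}
```

Whatever the platform, the enum cannot be smaller than its largest payload, so a `Bla::AByte` occupies as much space as a `Bla::AString`.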

> When I program in a language that has them then it's probably a sizeable fraction of all the types I define

...IMHO 'almost always sum types' is a serious design smell. It might be OK in 'everything is a reference' languages like TypeScript, but that's because you pay for the runtime overhead anyway, no matter if sum types are used or not.

rixed 2 months ago

I don't think we speak the same language. I was referring to the use case (in C) that's been described by sph, where you indeed malloc only the relevant size, and you have to manually and carefully check the tag before casting the payload to the proper type. This is what I am tired of doing over and over and over again, and would like a systems programming language to help with.