← Back to context

Comment by miki123211

10 hours ago

This isn't specific to Rust or Typescript. You can do this in basically any language.

Imagine you have to distinguish between unescaped and escaped strings for security purposes. Even with a dynamically typed language, you can keep escaped strings as an Escaped class, with escape(str)->Escaped and dangerouslyAssumeEscaped(str)->Escaped functions (or static methods). There's a performance cost to this, so that's a tradeoff you have to weigh, but it is possible.

Another way of doing this is Application Hungarian[1], though that relies on the programmer more than it does on the compiler.

[1] https://www.joelonsoftware.com/2005/05/11/making-wrong-code-...

> There's a performance cost to this

That part is (de facto) required for dynamically typed languages, but not for statically typed ones where the newtype constructor/deconstructor can be elided at compile time. Rust and C++ especially both do the latter by having true value types available for wrappers that evaporate into zero extra machine code.

But then just this moment I wondered: do any major runtimes using models with no static type info manage to do full newtype elision in the JIT and only box on the deopt path? What about for models with some static type info but no value types, like Java? (Java's model would imply trickiness around mutability, but it might be possible to detect the easy cases still.) I don't remember any, but it could've shown up when I wasn't looking.

  • Well, java can do escape analysis, so a wrapper with a single field may end up as a local variable of the embedded field.

    As for other JVM languages like Kotlin and Scala, they have basically what "newtype" is, but it can only be completely erased in the byte code when they have a single field.

    • Escape analysis that sinks a local allocation is great in itself, but for newtypes for things like “trusted HTML vs plain text”, I feel like the primary benefits are deeply interprocedural. The type constraint is encoding a promise that can be carried from one end of the code base to the other, and where you can know for sure when you're writing a module whether you're on one side of a barrier or the other. I would tend to expect this to result in patterns that aren't well-handled by escape analysis.

      What I'm imagining for my curiosity about the dynamic case would look more like “JS/Lua/whatever engine detects that in frob(x) calls, x is always shaped like { foo: ‹string› } and its object identity is unused, so it replaces the calling convention for frob internally, then propagates that to any further callers”, and it might do the same thing when storing one of those in fields of other objects of known shapes, etc. until eventually it hits a boundary where the constraint isn't known to hold and has to be ready to materialize the wrapper object there.

      Kotlin and Scala sound like they're doing the Rust/C++ thing at the bytecode level, if it's being “erased”, so just the static case again but with different concrete levels for machine vs language.

> This isn't specific to Rust or Typescript. You can do this in basically any language.

This just isn't true.

In any dynamic language you would not get these guarantees at compile time. You'd get random failures at runtime. That's not safety of any kind.

Also, part of the goal of languages like Haskell is that they help you think about your code before it runs. All of that is lost.

> Imagine you have to distinguish between unescaped and escaped strings for security purposes

That would be a nightmare in many languages. You'd have to rewrite large parts of the code to be compatible with one or both. And in many languages you'd have to duplicate your code entirely.

In other languages, the result would so ugly, you would never want to touch that code. Imagine doing this with say, templates in C++.

>There's a performance cost to this

There is no performance cost in Haskell! This is entirely undone by the compiler.

Also, because the compiler understands what's going on at a much higher level, you can do things like deriving code. You can say that your classified strings behave like your regular strings in most contexts, like say, they're the same for the purpose of printing but not for the purpose of equality, in one line.

What you cannot do is compile-time safety guarantees, and in languages like Rust type system isn't strong enough to do some advanced compile-time guarantees (via types). So no, you cannot do this in basically any language (unless you turn it into Haskell).

> You can do this in basically any language.

You can do it in Assembly. That doesn't mean it's cost effective.

  • And categorically: the issue isn’t what “I’d” do, my habits often match my habits, it’s what other project members will be doing (including future degenerate versions of myself assumed to be some combination of busy, tired, stressed and drunk).

    The Confucian philosophy that people act like water coming down a mountain, seeking the path of least resistance comes to play.

    Haskell, OCaml, F#, and their ilk can yield beautiful natural domain languages where using the types wrong is cost prohibitive. In languages without those guarantees every developer needs discipline to avoid shortcuts, and review needs increase, and time-pressure discussions rehashed.