Comment by OkayPhysicist

6 months ago

An interesting problem I've played around with a fair bit is the idea of a maximally expressive non-Turing-complete language: one that is at least somewhat comfortable to use for many tasks, while still letting you make static assertions about runtime behavior.

The best I've managed is a functional language that allows map, filter, and reduce, but forbids recursion and any other looping or infinite expansion in user code.
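To make the idea concrete, here is a minimal sketch of such an evaluator in Python. It is not the commenter's actual design: the tuple-based syntax, the form names, and the `BINOPS` table are all assumptions made for illustration. Functions are deliberately not first-class values, so there is no way to build a self-applying lambda, and every evaluation is a structural walk over a finite tree and finite lists.

```python
import operator

# Hypothetical set of scalar primitives; any total binary function could be added.
BINOPS = {
    "add": operator.add,
    "mul": operator.mul,
    "mod": operator.mod,
    "eq": operator.eq,
}


def evaluate(expr, env=None):
    """Evaluate a nested-tuple expression; the only iteration is map/filter/reduce."""
    env = env or {}
    op = expr[0]
    if op == "lit":
        return expr[1]
    if op == "var":
        return env[expr[1]]
    if op in BINOPS:
        return BINOPS[op](evaluate(expr[1], env), evaluate(expr[2], env))
    if op == "map":
        _, name, body, source = expr
        return [evaluate(body, {**env, name: item}) for item in evaluate(source, env)]
    if op == "filter":
        _, name, body, source = expr
        return [item for item in evaluate(source, env)
                if evaluate(body, {**env, name: item})]
    if op == "reduce":
        _, acc_name, item_name, body, init, source = expr
        acc = evaluate(init, env)
        for item in evaluate(source, env):
            acc = evaluate(body, {**env, acc_name: acc, item_name: item})
        return acc
    raise ValueError(f"unknown form: {op!r}")


# Sum of the squares of the even numbers in 1..6.
evens = ("filter", "x",
         ("eq", ("mod", ("var", "x"), ("lit", 2)), ("lit", 0)),
         ("lit", [1, 2, 3, 4, 5, 6]))
squares = ("map", "x", ("mul", ("var", "x"), ("var", "x")), evens)
program = ("reduce", "acc", "x",
           ("add", ("var", "acc"), ("var", "x")), ("lit", 0), squares)
print(evaluate(program))  # 56
```

Every program in this toy language halts, but that says nothing about how long it takes or how much memory it uses.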

The pitch is that this kind of language could be useful in contexts where you're executing arbitrary code provided by a potentially malicious third party.

5 comments


ameliaquining 6 months ago

I think you're asking for Starlark (https://starlark-lang.org), a language that strongly resembles Python but isn't Turing-complete, originally designed at Google for use in their build system. There's also Dhall (https://dhall-lang.org), which targets configuration use cases; I'm less familiar with it.

One problem is that, while non-Turing-completeness can be helpful for maintainability, it's not really sufficient for security. Starlark programs can still consume exponential amounts of time and memory, so if you run an adversary's Starlark program without sandboxing it, you're just as vulnerable to denial-of-service attacks as you'd be with a Turing-complete language. The most common solution is sandboxing, wherein you terminate the program if it exceeds time or memory limits; however, once you have that, it's no longer necessary for the language to not be Turing-complete, so you might as well use a popular mainstream language that's easy to sandbox, like JavaScript.

One other intriguing option in the space is CEL (https://cel.dev), also designed at Google. This targets use cases like policy engines where programs are typically small, but need to be evaluated frequently in contexts where performance matters. CEL goes beyond non-Turing-completeness, and makes it possible to statically verify that a program's time and space complexity are within certain bounds. This, combined with the lack of I/O facilities, makes it safe to run an adversary's CEL program outside a sandbox.
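As a rough illustration of what statically bounding cost can look like, the sketch below computes a worst-case step count for a toy expression tree from declared input sizes, without evaluating anything. It is not CEL's actual cost model or API: the tuple syntax, the `static_cost` and `within_budget` names, and the "shape" convention are all hypothetical.

```python
def static_cost(expr, shapes):
    """Return (max_steps, shape) for expr without evaluating it.

    A shape is None for a scalar, or (max_len, element_shape) for a list.
    `shapes` maps variable names to their declared worst-case shapes.
    """
    op = expr[0]
    if op == "lit":  # scalar literal
        return 1, None
    if op == "var":  # declared input or a variable bound by map
        return 1, shapes[expr[1]]
    if op == "add":
        left_steps, left_shape = static_cost(expr[1], shapes)
        right_steps, right_shape = static_cost(expr[2], shapes)
        if left_shape is not None or right_shape is not None:
            raise TypeError("add expects scalars")
        return left_steps + right_steps + 1, None
    if op == "map":
        _, name, body, source = expr
        source_steps, source_shape = static_cost(source, shapes)
        if source_shape is None:
            raise TypeError("map expects a list")
        max_len, element_shape = source_shape
        body_steps, body_shape = static_cost(body, {**shapes, name: element_shape})
        # Worst case: the body runs once per element of the longest allowed list.
        return source_steps + max_len * body_steps + 1, (max_len, body_shape)
    raise ValueError(f"unknown form: {op!r}")


def within_budget(expr, shapes, budget):
    """Accept the program only if its worst-case step count fits the budget."""
    steps, _ = static_cost(expr, shapes)
    return steps <= budget


# Add 1 to every number in up to 100 rows of up to 10 numbers each.
declared = {"rows": (100, (10, None))}
program = ("map", "row",
           ("map", "x", ("add", ("var", "x"), ("lit", 1)), ("var", "row")),
           ("var", "rows"))
print(static_cost(program, declared)[0])       # 3202
print(within_budget(program, declared, 5000))  # True
print(within_budget(program, declared, 1000))  # False
```

The bound only holds because every operation in this toy language has a cost that depends on declared sizes; an operation such as unbounded string concatenation would need its own size tracking.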

layer8 6 months ago

Non-Turing-completeness doesn’t buy you that much, because you can still easily multiply runtime such that a program wouldn’t terminate within your lifetime. With just map you can effectively build the cross product of a list with itself. Do that in an n-times nested expression (or nested, non-recursive function calls), and for a list of length k the result is a list of length kⁿ. And with reduce you could then concatenate a string with itself those kⁿ times, resulting in a string (and likely runtime and memory usage) whose length is proportional to 2^(kⁿ).
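The blow-up is easy to reproduce with small numbers. The Python sketch below uses hypothetical helper names; the host-side `for` loop only stands in for writing the n nested expressions by hand, and `chain.from_iterable` is an assumed flattening helper that the comment does not mention.

```python
from functools import reduce
from itertools import chain


def cross_with(base, acc):
    """One nesting level: pair every element of acc with every element of base."""
    nested = map(lambda a: map(lambda b: (a, b), base), acc)
    return list(chain.from_iterable(nested))  # flatten one level


def nested_cross(base, n):
    """Cross the list with itself n - 1 times, giving len(base) ** n elements."""
    acc = list(base)
    for _ in range(n - 1):  # stands in for n hand-written nested expressions
        acc = cross_with(base, acc)
    return acc


def doubled(items, seed="x"):
    """Concatenate the accumulator with itself once per item."""
    return reduce(lambda s, _: s + s, items, seed)


pairs = nested_cross([0, 1], 3)
print(len(pairs))           # 8, i.e. 2 ** 3
print(len(doubled(pairs)))  # 256, i.e. 2 ** 8
```

With k = 10 and n = 3 the list already has 1,000 elements, and the doubled string would need 2^1000 characters.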

If you want to limit the runtime, you need to apply a timeout.
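A minimal sketch of a wall-clock timeout, assuming the untrusted program can be run as a separate Python process; `run_with_timeout` is a hypothetical helper name. This only bounds time: it does not limit memory, file access, or network access, so it is not a full sandbox.

```python
import subprocess
import sys


def run_with_timeout(source, seconds):
    """Run Python source in a child process; return its stdout, or None on timeout."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=seconds,  # the child is killed once this many seconds elapse
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout


print(run_with_timeout("print(6 * 7)", 10))     # prints 42 followed by a newline
print(run_with_timeout("while True: pass", 1))  # None
```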

tjalfi 6 months ago

If you're interested in prior art, Ian Currie's NewSpeak was an attempt at a non-Turing-complete language for safety-critical systems. Most of the search results are for a different language with the same name, but "RSRE currie newspeak" should find relevant links.

Expurple 6 months ago

Idris is Turing complete [1] and has a very advanced and expressive type system (dependent types) where type checking is still guaranteed to halt.

That's because the language has a notion of `total` functions [2], and only `total` functions can be used for computing type signatures. These `total` functions must terminate in finite time and not crash. AFAIK, they aren't Turing complete, but they're still pretty expressive. `partial`, Turing complete functions are still allowed at runtime (outside of type signatures) [1].

If I understand correctly, running Idris in the type-checking mode (`--check`) should give you what you want.

[1]: https://cs.stackexchange.com/a/23916/159425

[2]: https://idris2.readthedocs.io/en/latest/tutorial/typesfuns.h...

BriggyDwiggs42 6 months ago

Could be a good idea for a multiplayer in-game scripting language.