I work on a deployment tooling team. In 2019/2020 we did a deep dive into Dhall vs. Jsonnet^1 for standardizing config and kubernetes templating across my company (Zendesk). We ended up going with Jsonnet (although some Dhall evangelists in the company have kept the dream alive!), which I think is a more approachable language for many, but Dhall has a lot of cool features and good things going for it.
Jsonnet is far from perfect, but it gets the job done pretty well and has been relatively easy for engineers across the company to pick up.
That said, after a 3+ year journey, the shortfalls in our original design have become more noticeable and we're giving more thought to writing a more robust tool using a more traditional language like Go to solve some of our configuration/deployment and data templating problems.
Cue^2 is something I'm keeping an eye on these days as well.
> In April 2019, there were 180M lines of BCL, while C++/Java sat at ~300M.
It's... not obvious to me that that's a good thing? The ratio of configuration code to code in the things being configured being that close makes me think that BCL is something that's ill-suited to what it's now being asked to do but there's too much of it to realistically replace.
Maybe it's amazing and the problem it's solving is so complicated that even a language designed very well for solving that problem leaves you needing a lot of it, but I don't really consider "a staggeringly large amount of code has been written in this language" to necessarily be an endorsement of that language's quality. Citing Google's 300M lines of Java would also be a silly reason to pick Java in a project where you'll never interact with the Google ecosystem.
Without knowing anything about Cue, just be careful about choosing a technology only because it scales to the largest infra on earth; that doesn't mean it scales equally well for smaller deployments.
Yeah, I really like what I've seen from CUE, and obviously the core team behind it is the real deal. I've never worked at Google but I've done quite a bit of research into Borg/BCL. AFAIK Jsonnet is basically a direct descendant of BCL, whereas CUE is the next evolution that tries to fix what BCL got wrong.
CUE was in its infancy when we were evaluating Jsonnet (and I wasn't even familiar with it at the time). If we were picking a data templating language today there's a good chance we would choose it. We very well may end up migrating to it or incorporating it in some way.
Whilst I like the look of Cue... this argument reminds me of the Kubernetes trap
( We should use it... because Google does it and they're huuuuuuuge don't you know etc )
This is actually one of my main concerns with CUE and something I'd like the community overall to keep in mind. There is a high chance, in my opinion, that CUE ends up being adopted as a silver bullet for configuration simply because it gets cargo-culted as something from Google, combined with the drive for dependence that open source users sometimes have, similar to what happened with Kubernetes. That modules and package management in CUE are modeled after what Go did is also questionable IMO, but it makes sense since the project is Go-based.
CUE is a good tool for validation and has its uses, but Dhall has some really neat innovations that make it an exciting project, so it's worth at least looking at both and comparing them first.
Just my 2 cents: when I started at my current employer, we had a huge, convoluted Dhall project for kube. We ended up switching to a real language (Python in our case due to reasons, Go is a more correct choice) and are very pleased with the results.
I thought the same thing. As a Node.js dev, I can quickly create a .mjs file and use `export default` to return a JavaScript object that contains the configuration. That way, I can use template strings and/or functions to do what I want. I can even load JSON files natively.
I like Dhall for what it gives you with type safety etc., but the compile times are extremely slow on a large project and type signatures can get really long/hard. It is a bit of a Dhall monolith though ;)
> we're giving more thought to writing a more robust tool using a more traditional language like Go to solve some of our configuration/deployment and data templating problems.
You might want to give Pulumi (no affiliation) a look. I've been using it with Typescript, but it supports Go, too.
jsonnet seemed like a great idea to me, but I've experienced extremely low performance. My Kubernetes manifest with a couple hundred lines of code in 4 files, which rendered instantly in Python before and renders instantly now with Helm, would take 10-20s with jsonnet. The "lazy evaluation" would probably go into some quadratic or exponential behavior, evaluating the same thing many times, but I sure couldn't see why (it was very straightforward code) or where (no debug tools...).
I really wouldn't use it for anything until tooling improves (but then again I'd much rather use something like Starlark).
> jsonnet seemed like a great idea to me, but I've experienced extremely low performance.
Although each implementation of jsonnet has some quirks, take a look at sjsonnet^1 (scala-based) or go-jsonnet^2 for improved performance. We generally prefer go-jsonnet.
There's also a Rust version^3 that claims to be the fastest yet^4, but I haven't experimented with it at all.
ytt lets you embed logic via a python-subset (starlark) and also provides "overlays" as a "replace/insert" mechanism. and all valid ytt files are valid yaml files, so they can be passed-through other yaml parsing stages.
Thanks for sharing! We've done a little experimenting with ytt (we already use several tools from Carvel/k14s, mainly vendir and kbld), but it's been a while...probably a year or more...since I've played with it. Need to give it another look.
Relying on configuration files to glue together a huge stack of tools and services in order to produce an end product is a huge mistake. I say it's a mistake fully knowing that nearly everyone in the industry is doing it.
Configuration is opaque. Undebuggable. Unmaintainable. By its very nature. No matter what "language" you use for it.
You should strive to keep configurable things to an absolute minimum.
Although it's worth making a distinction between "configurations" and "settings": when something is configured wrong, the application basically fails to operate as intended. For settings, there can't be "wrong" settings, all settings are valid, and if the input for a specific setting was invalid, the application can simply ignore it and use the default value. Example of a "setting": font size for a text editor. Example of configuration: the connection information for a database.
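A minimal Python sketch of that distinction, using hypothetical names (`load_font_size`, `load_db_url`, `DATABASE_URL`) that don't come from any real application: an invalid setting quietly falls back to its default, while missing configuration stops the program.

```python
DEFAULT_FONT_SIZE = 12  # hypothetical default for the text-editor example


def load_font_size(raw):
    """Setting: invalid input is ignored and the default is used instead."""
    try:
        size = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_FONT_SIZE
    return size if size > 0 else DEFAULT_FONT_SIZE


def load_db_url(env):
    """Configuration: the app cannot operate without it, so fail loudly."""
    url = env.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL is required")
    return url
```

Calling `load_font_size("huge")` just yields the default, whereas `load_db_url({})` raises, which is the "fails to operate as intended" case.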
Convention over configuration is really the future, but not everything follows that philosophy. For example, OCaml codebases are basically a free for all and you can organize things however you want. So with that in mind, you need something to organize and build.
Convention before configuration doesn't solve the core problem.
Why do you need configuration? Because you build your application by gluing together a multitude of different applications that need to be configured properly to be able to talk to each other.
Once you understand this, the solution is obvious: don't build your application this way.
Build it in one language. If you want to "reuse" existing solutions, reuse them as libraries, not as separate programs.
> Configuration is opaque. Undebuggable. Unmaintainable. By its very nature. No matter what "language" you use for it.
What makes you think so? How do you define configuration?
I mean, in the extreme case you can see Python as a configuration language for the behaviour of the Python interpreter. Does that make Python a configuration language? Where do you draw the line?
Configuration is some sort of information that is stored in a text file, for reading by one or more programs, where these programs cannot meaningfully operate without this critical information in the configuration files.
If you have ever worked in the web startup industry, their use is endemic in everything. You often cannot deploy any web application without properly understanding tons of configuration files.
Because everyone builds applications out of services that must be glued together, and these services often glue themselves together via configuration files. The Python side of the application code needs to figure out how to connect to all the databases (postgres, redis, elastic, memcached ... etc). Each of these databases sometimes needs its own configuration before it can meaningfully operate. And because everyone is using Docker, you need Dockerfiles for all of your services. The list of things that need text-based configuration files goes on and on.
Go to any web company and ask around to see who can meaningfully understand and edit or maintain these configuration files, and you will find the majority of the people have no idea what's going on inside them. It's very common that only two people have a decent understanding of all the configs.
Now to answer your core question:
Why are they opaque and undebuggable?
Because to understand the configuration files, you have to understand the programs they are intended for, the environments they end up in, and the language they are written in.
For example, to understand a three-line docker compose configuration for ha_proxy, you have to understand the language of docker compose, and the way configuration files are aggregated. For example, you can define some VARIABLES in one config file and have them be available for reading by another docker compose config file in a totally different place. But you have to understand under what conditions which files are included together. For instance, you can have 10 files define different values for one VARIABLE, each of them running in a different environment. You have to understand all of this. And we haven't even gotten to ha_proxy yet. Now you have to understand how ha_proxy is influenced by the configurations you have specified. Let's say it's telling it about another server to connect to, or a directory to serve static assets from. The path of this directory depends entirely on how other docker images are composed together, which in turn is usually influenced by the contents of many other docker compose config files that are, again, spread out across many files. You need to figure out _which_ config file is used to make a certain directory available at a certain location, etc.
There's no tool that can debug this mess. There's no substitute for deep and intimate knowledge of all the details of the system.
It's very possible for the whole system to collapse when you make one tiny small mistake in one configuration file. There's no tool that can help you debug what's going on.
Dhall is my favorite configuration language that I never get around to using.
I manage DNS in Terraform, and since every Terraform provider uses different objects definitions, and every object definition is rather verbose, Dhall would be a way to specify my own DRY types and leave the provider-specific details in one place. Adding new DNS entries and moving several domains between providers would be a matter of changing fewer lines.
tanka helped me make peace with k8s (yaml), or at least made for a better learning environment for k8s than the other options. I wonder about other people's experiences or their config tool rite of passage.
We use https://github.com/octodns/octodns for some of our DNS records. It's flexible, much faster than Terraform for thousands of records, and the maintainer Ross has been responsive on issues and pull requests. Also see Cloudflare's blog for how they use it.
I think Dhall is good at pushing forward some ideas, but honestly I feel like Skylark (Python but not Turing complete) just feels like the right way forward. Being able to specify dynamic functionality in a configuration file, when paired with a good configuration API, really makes stuff straightforward IMO. Gunicorn is the best example of this.
We have all of these tools that try to propose declarative configuration, then run into the fact that people really do want dynamic systems with some abstraction capabilities, and then have to overlay that into their systems.
I feel increasingly alone in this position but I absolutely hate those Python-like languages.
They look like Python, and they occasionally act like Python. But occasionally they don't, and the mask of looking like Python obscures those times. Then you never really know which Python facilities you have and which ones you don't.
This opinion is ironic because this is exactly what the author intended to describe -- the typo IS hard to find.
The "Hello, World" example is nothing more than JSON, showing a string repeated 3 times. On the third time, "bill" is misspelled with "blil". The next tab shows how Dhall uses a variable definition to prevent this type of error.
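The same point can be sketched in Python rather than Dhall. The paths and key names below are illustrative assumptions, not a copy of the example on the Dhall site.

```python
# Repeating the literal makes a transposition like "blil" easy to miss.
repeated = {
    "home": "/home/bill",
    "private_key": "/home/bill/.ssh/id_ed25519",
    "public_key": "/home/blil/.ssh/id_ed25519.pub",  # the typo hides here
}

# Naming the value once means it is either right everywhere or wrong everywhere.
user = "bill"
home = f"/home/{user}"
factored = {
    "home": home,
    "private_key": f"{home}/.ssh/id_ed25519",
    "public_key": f"{home}/.ssh/id_ed25519.pub",
}
```

Both versions are syntactically valid, which is why hunting for a syntax error in the first one turns up nothing.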
I too spent the whole time scrutinizing the syntax and ignoring the values, since this is an intro to the language and not to Bill’s dotfiles. I was ultimately unable to deduce the presumed syntax error and independently came to the same misconclusion.
I mean... from context, it's probably a mistake, but on the other hand we don't actually know the structure of that filesystem or the author's intentions.
Meh, I think you're pointing out why doing this with a variable is better. In other words that mistake is the entire point of the example. Perhaps the definitions tab could make it more clear what the mistake was.
Yeah, I spent 3 minutes looking for a syntax error in 5 lines of code. I couldn’t find anything. I’m obviously too dumb to figure out this language so I guess I’ll just stick with YAML.
I like a lot of the ideas in Dhall, but I disagree with some of the design decisions.
The syntax for objects with dynamic keys seems unnecessarily verbose.
The arithmetic capabilities are very restrictive. No subtraction, division, or modulo. Addition and multiplication only work on Natural. No numeric comparison. `==` and `!=` only work on booleans. There isn't really much motivation given for why it is so restricted.
Maybe I'm missing something, but it seems like if a type has a lot of optional fields, which is pretty common, you have to explicitly pass None for all of them. Maybe the pattern is to merge a default record with a smaller record that has what you actually want? But I didn't see any examples of that. Also, how do you deal with cases where null and the absence of a value are treated differently? For example, if leaving the value off means use the default and null means turn a feature off. From what I can tell, optionals are either always null or always absent when None.
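To make the "merge a default record with a smaller record" idea concrete, here is a rough sketch in Python rather than Dhall (Dhall has a right-biased record merge operator, `//`, that plays a similar role). The field names and the `MISSING` sentinel are assumptions made up for illustration; the sentinel is one way to keep "absent" and "null" apart.

```python
MISSING = object()  # sentinel: "key was left off", distinct from None (null)

DEFAULTS = {"replicas": 1, "timeout_s": 30, "cache": "memory"}


def complete(overrides):
    """Right-biased merge: fields in overrides win over DEFAULTS."""
    return {**DEFAULTS, **overrides}


def describe_cache(record):
    """Treat an absent key and an explicit null differently."""
    value = record.get("cache", MISSING)
    if value is MISSING:
        return "use default"
    if value is None:
        return "feature off"
    return f"use {value}"
```

With this shape, the caller only writes the fields that differ from the defaults, e.g. `complete({"replicas": 3})`.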
It also seems like it would be annoying to have to specify types whenever calling functions like List/length.
IDK, maybe in practice these aren't as bad as they seem.
> Maybe I'm missing something, but it seems like if a type has a lot of optional fields, which is pretty common, you have to explicitly pass None for all of them.
As much as I love Dhall, I think it's actually better to allow any Turing complete programming language and require it to generate the config in a certain format (whatever it is, maybe even Dhall).
The only thing needed is to run the config-generation process in a sandbox and restrict how long it can run. If it takes too long then we stop it, which is the same thing we should do with a Dhall config as well, even if Dhall theoretically will finish in a finite amount of time.
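A minimal sketch of the time-limit half of that idea, assuming the generator is a Python snippet that prints JSON. The helper name `generate_config` is made up, and this does not sandbox anything: real isolation of filesystem and network access would need OS-level tooling.

```python
import json
import subprocess
import sys


def generate_config(code, timeout_s=2.0):
    """Run a config generator in a child process and parse its JSON output.

    Only the time limit is enforced here; this is not a real sandbox.
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=True,  # a non-zero exit raises CalledProcessError
        )
    except subprocess.TimeoutExpired:
        raise RuntimeError("config generation took too long") from None
    return json.loads(done.stdout)
```

A generator that loops forever is killed once the timeout expires, and the caller sees a `RuntimeError` instead of a hung deploy.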
That way everyone can use their preferred language to describe and maintain the configuration. We just have to agree on the format that is being spit out at the end, such as json, yaml, dhall or even a proprietary format.
How so? Normally, generating a config, even a big one, is a matter of milliseconds even for slow languages.
We are not talking about applying the config; we are just talking about using some inputs (variables) and generating a string / text file that we return to the entity that then validates and uses the config.
I love seeing useful languages that aren’t Turing complete. Tooling gets much more interesting when you don’t have so many impossibility results around all the interesting stuff.
What practical advantages are there? Someone can always write a program that runs until the heat death of the universe even in a Turing-incomplete language.
The major advantage of a language that isn't Turing complete is not having the major risk inherent to Turing complete languages: asking if any non-trivial program will produce any given result or behavior is undecidable[1].
> write a program that runs until the heat death of the universe even in a Turing-incomplete language.
The Halting Problem is just a simple example of program behavior. The undecidability extends to any other behavior. Asking if a given program will behave maliciously is still undecidable even if we only consider the set of programs that do halt in a reasonable amount of time.
When you are using a regular language or a deterministic pushdown automaton, questions about behavior, or even whether two implementations are equivalent, are decidable. It is at least possible* to create software/tools to help answer the question "is this input safe." When you use a non-deterministic pushdown automaton or anything stronger, your problem becomes provably undecidable*.
I highly recommend the talk "The Science of Insecurity"[2].
In theory that's true for many of the kinds of Turing-incomplete languages we care about. (Eg it's not true for JPG or for (proper) regular expressions.)
From Dhall's docs (emphasis mine):
> Note that a “finite amount of time” can still be very long. For example, there are some short pathological programs that take longer than the heat death of the universe to evaluate. The main benefit of evaluation being finite is not to eliminate long-running programs but to make them significantly less probable. In practice, you will discover that you will rarely author a configuration file that takes a long time to evaluate by accident.
> For example, Dhall does not provide language support for recursion. If you try to define a recursive expression or function you will get a type error. Lists are the only recursive data structure and the only way to build or consume lists is through safe primitives guaranteed to terminate, like List/fold. This restrictive programming style keeps code simple and makes expensive code more obvious (both to the code author and reviewer).
Totality checking like that in Idris is one nice example. That's kind of circular though - it isn't Turing complete because it has totality checking (i.e. you must prove the program terminates).
I absolutely loved using this tool to generate many instances of Datadog resources in terraform json. Not only did it simplify a nasty mess of dynamically typed and schema-less bash/python/mako, it taught me a lot of functional programming paradigms along the way.
I've always wanted an excuse to use this but never had a reason. The closest I come is using Nix, and I know there is a Dhall-to-Nix compiler, but since Dhall can't represent everything possible in Nix, I haven't seen any good reason to use it.
Dhall is advertised as a configuration language but you can do a tad more. I use it in my blog-engine to fit a use-case I found was poorly addressed by other approaches: small-cardinality datasets that benefit from type-checking and templating (e.g., list of notes, a photo gallery). I don't claim the idea is especially novel but I found the use case rare and interesting enough to write some explanations, design, and a demo here:
https://lucasdicioccio.github.io/dhall-section-demo.html
I wonder if a version of Python with types with some way to force programs to be total would be a 'good enough' substitute. JavaScript has `"use strict";`, `# use total` could ban recursion and loops over non-constants. You'd have to work hard to ensure things were really total, e.g. ban assigning functions to variables, but maybe you could make it hard enough to pay off.
Starlark is a version of this, and it's used by Bazel in particular. It's very cool tech, and sidesteps the difficulties of ensuring Python works without shipping a specific Python interpreter.
I have been writing a fair amount of Dhall using autogenerated CloudFormation bindings ( https://github.com/jcouyang/dhall-aws-cloudformation/ ). It is a fantastic way to reduce boilerplate and factor out recurring blobs. My main frustration is that the type checker is not smart enough (or maybe the type system is undecidable?) - every time you want to use a polymorphic function, you must pass in the type parameters yourself (this is also true for empty lists and `None`). This makes simple FP idioms extremely noisy, to the point where you're better off writing longhand. In a language that's meant to be alleviating YAML/JSON boilerplate.
It's still a massive improvement, but it could be so much better if the typechecker was smarter.
They are different from Dhall. Puppet, Ansible and Salt are task runners. You give them configuration and they parse the configuration and then run tasks on a destination machine based on the configuration.
Dhall is a tool to generate configuration such as JSON, YAML, or XML. It’s useful if you have *complex* configurations such as an AWS CloudFormation stack or Kubernetes YAML files.
Ah ok .. thanks for the details. Puppet was also called a configuration manager and it had the capability to generate configuration files of any format in a dynamic way (via support for variables, parameters etc).
I loved my experience with Dhall from the PureScript community. It prompted me to learn a lot more and I built some small libraries too. My experience was mostly positive. Supporting Unicode is the cherry on top.
Of all my gripes, only two still stick in my mind:
• No JavaScript implementation; I want to be able to use it more places, but I always end up having to convert it to JSON to use it
• the built-in formatter is way too aggressive; I'm fond of the never-contract-only-expand formatters that don't pressure me to collapse things I don't want collapsed, and the way it works now isn’t optimized for Git diffs, where conflicts can arise.
Dhall needs to provide official bindings to all major platforms (these days that’s JS, Python, .NET, JVM). It looks like a really cool project but I don’t use Haskell or Ruby.
This is a by-product of Dhall coming from the Haskell community where this formatting was preferred instead of final trailing commas. It has been internalised and was carried on. I say this as someone who looks at this code and thinks "This is totally fine." -- but I admit, I'm environmentally damaged.
Yes. And, alas, you can't just allow a final trailing comma everywhere in Haskell.
Eg ("Foo", 2,) is different from ("Foo", 2) in Haskell thanks to TupleSections. For innocent bystanders: in Haskell ("Foo", 2) is the tuple you'd expect it to be. But ("Foo", 2,) is a function that takes another argument and creates a three-tuple. A Python equivalent would be
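One way such a Python equivalent could be written, as an illustrative sketch (the name `section` is made up here):

```python
pair = ("Foo", 2)  # the ordinary two-tuple


def section(third):
    """Plays the role of Haskell's ("Foo", 2,) under TupleSections."""
    return ("Foo", 2, third)
```

So `section(True)` builds a three-tuple. In Python itself, by contrast, `("Foo", 2,)` with the trailing comma is still just the same two-tuple as `pair`.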
I much prefer this style, I use it wherever I can because I can comment out lines without a trailing comma causing a syntax error. SQL, Ruby… wherever I can.
Try it, I doubt you'll go back (unless someone's stupid parser doesn't let you).
Yeah, I also really dislike this mixing up of indentation levels because of a language limitation. The "{" denotes a container, and the "," a separation between items of said container, for me it makes sense for them to be nested a level deeper
That's one way to interpret things, but not the only way.
In practice, this style reads just fine once you get used to it. Indentation is mostly there as a human convenience, so just has to work well with human brains (and be understood by a computer, if it's significant), but doesn't have to necessarily follow some abstract unified theory of syntax trees.
Both symbols are delimiters; one can choose to align them together even when they are not the same character. Opening and closing braces aren't the same character either, but people have been aligning them for ages. I don't see a reason why commas, while being part of the same expression, should not follow the same principle.
I have recently been looking for a program that does a similar thing, i.e. simplifies the creation of configs, and I did not find a suitable one by Googling. Then I realized that PHP is perfect for that and I already know it! The output of PHP does not have to be HTML or other web stuff; it can be a config file instead.
I too wouldn't want to use a bash script as a config file. That's the point of Dhall: you get good static types and it's not Turing complete, so you get a lot of the safety and code reusability tools without the pitfalls of a general purpose programming language.
And, have you ever worked with giant configs in json or yaml? It becomes incredibly painful to manage.
My biggest pain when doing this is: set up an environment configuration. Then set up a test or stage environment that matches it. Then modify the environment over the next few years keeping the stage and production environment configs in sync so that testing and validation are useful (the stage configuration actually matches production so that success or failure in staging predicts the same in production) before going to production.
Even something as simple as variable substitution becomes very useful in these cases so that configuration can be a single file and substitution delivers the staging or production config. Functions and operators allow more complicated configurations to remain DRY. Checks prevent the stage servers from using the production database connection string, etc.
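A small sketch of that single-file-plus-substitution approach, using Python's standard `string.Template`. The host names, keys, and the "prod" substring check are all hypothetical stand-ins for whatever rule fits a real setup.

```python
from string import Template

# Hypothetical single template shared by both environments.
TEMPLATE = Template("db=$db_host\nreplicas=$replicas\n")

ENVIRONMENTS = {
    "staging": {"db_host": "db.staging.internal", "replicas": "1"},
    "production": {"db_host": "db.prod.internal", "replicas": "3"},
}


def render(env_name):
    """Fill the shared template with one environment's values."""
    values = ENVIRONMENTS[env_name]
    # Check: a non-production environment must not point at the prod database.
    if env_name != "production" and "prod" in values["db_host"]:
        raise ValueError(f"{env_name} uses a production database host")
    return TEMPLATE.substitute(values)
```

Because both environments render from the same template, a structural change lands in staging and production together, and only the substituted values can drift.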
People tend to evaluate these things based on what cool and neat things they can do with them, not on whether telling people they need to learn Haskell before they can update what would've been a 30-line configuration file on the project they just joined is a good use of people's time.
I think more programming power is useful in the service of stopping bugs; for instance, you could have a language where the part that "does stuff" isn't Turing complete, but the part that "stops incorrect stuff from happening" (the type system) is.
Don't. Use helm, kustomize or a decent code language.
Dhall will constrict, slow you down, make onboarding a nightmare, and ultimately be as brittle as other alternatives (Only it's harder to find where it broke).
I cannot advocate against dhall enough.
this is what I thought too. I've enjoyed working with helm and totally recommend it but was wondering if I'd missed something. A templating language and not an actual programming language seems to be the right balance for config
I said this above as well: ytt (https://carvel.dev/ytt/) lets you embed starlark into valid yaml, among other cute tricks for managing biz-logic in configs.
Most of the time with configuration files, you want to be able to do at least basic validation on the config without having to run the whole piece of software. Python does not enforce types before the program runs, so there's nothing to stop the user from creating a configuration file where, for example, a field that is intended to specify a port number has been set to a string. This error will become obvious when you run the software (and it presumably fails some way in), but for some use cases this is too late. With Dhall you can specify that the field for the port number needs to be not just an integer but a natural number, so you're able to catch plenty of config errors nice and early, and can go some way to validating the config in isolation from the software it's intended to be used for.
I believe you may want a non-Turing-complete language for strict configuration.
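For comparison, getting a similar early check in plain Python takes explicit validation code. A minimal sketch, with a hypothetical `ServerConfig` class:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int

    def __post_init__(self):
        # Annotations alone are not enforced at runtime, so check by hand.
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.port, bool) or not isinstance(self.port, int):
            raise TypeError(f"port must be an integer, got {self.port!r}")
        if not 0 <= self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")
```

Constructing `ServerConfig("localhost", "8080")` fails immediately, without starting the server, but only because the check was written out by hand.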
But sometimes, as you indicate, you want VERY dynamic configuration. But I would argue then that such logic goes in your application itself and is not in fact part of your configuration.
Sometimes you just explicitly want to output text or a structured format, but you'd rather have a template language do it because it being a lot less powerful actually makes it easier to write close to the output format.
I really hope that one day Google can open source GCL, which so many languages have been inspired by without ever really being as good as it is. They all claim to address pitfalls in GCL but at the end of the day they're just worse than GCL.
Can you say more about what GCL does better than all of the open source ones?
Anecdotally, I've heard a lot of GCL horror stories, and many Xooglers have chosen to create things like Jsonnet or Skycfg (https://github.com/stripe/skycfg) instead.
It creates nested data without looking like the end format of the data (why I really can't get into jsonnet), it's just obviously its own language but is pretty minimal.
Despite the minimalism it is not necessarily simple. There are features like inheritance and late binding. They can be quite complicated, but the thing that really stands out about GCL is it's not complicated to use. The simple cases are easy, straightforward, and readable. The complicated cases are made possible.
I've used it extensively to store and transform data that has to be manually edited by humans. It's a really great format because I can define all the translation rules which can get pretty complicated, but the complexity I expose to the humans who are writing the data is minimal. But if they need to do some transformations on the data or even just, like, string substitutions...the language is there and they can use it. That's what makes writing your data in a configuration language nice.
(Honestly though I think jsonnet is at least vastly superior to skycfg and the whole starlark ecosystem. I swear I'm so sick of languages that intentionally look like Python without being actual Python!)
> However, the early design of GCL went for something simpler that coincidentally was also incompatible with the notion of graph unification. This simpler approach proved insufficient, but it was already too late to move to the earlier foreseen approach. Instead, an inheritance-based override model was adopted. Its complexity made the earlier foreseen tooling intractable and they never materialized. The same holds for the GCL offsprings that copied its model.
Needless to say I disagree with the Cue author. I think the inheritance-based override model is fantastic and has made for a great and straightforward configuration language.
And yet, even if they did it right now, it'll be over there in the pile of "yet another configuration language" for the same reason we already have like 50 of them
Why not instead use one of the popular programming languages (C#, Java, ...), possibly with some dedicated library built on top of it? Also, reminds me of the Heptagon of Configuration [1].
[1]: https://jsonnet.org/
[2]: https://cuelang.org/
CUE's author invented the Borg configuration language, or BCL, back in 2008.
BCL is the 3rd largest body of human-written code in Google's internal code base. In April 2019, there were 180M lines of BCL, while C++/Java sat at ~300M.
BCL's large-scale use is probably beyond any other infrastructure-as-code use case known to humankind.
And the lessons and ideas from more than a decade are manifested in CUE.
Personally, this is enough to convince me to comfortably ignore anything else on the market.
> In April 2019, there were 180M lines of BCL, while C++/Java sat at ~300M.
It's... not obvious to me that that's a good thing? The ratio of configuration code to code in the things being configured being that close makes me think that BCL is something that's ill-suited to what it's now being asked to do but there's too much of it to realistically replace.
Maybe it's amazing and the problem it's solving is so complicated that even a language designed very well for solving that problem leaves you needing a lot of it, but I don't really consider "a staggeringly large amount of code has been written in this language" to necessarily be an endorsement of that language's quality. Citing Google's 300M lines of Java would also be a silly reason to pick Java in a project where you'll never interact with the Google ecosystem.
Without knowing anything about Cue: just be careful about choosing a technology only because it scales to the largest infra on earth. That doesn't mean it scales equally well to smaller deployments.
For the record, when I was an SRE at Google, most people I talked to hated BCL with a passion.
Yeah, I really like what I've seen from CUE, and obviously the core team behind it is the real deal. I've never worked at Google but I've done quite a bit of research into Borg/BCL. AFAIK Jsonnet is basically a direct descendant of BCL, whereas CUE is the next evolution that tries to fix what BCL got wrong.
CUE was in its infancy when we were evaluating Jsonnet (and I wasn't even familiar with it at the time). If we were picking a data templating language today there's a good chance we would choose it. We very well may end up migrating to it or incorporating it in some way.
Whilst I like the look of Cue... this argument reminds me of the Kubernetes trap ( We should use it... because Google does it and they're huuuuuuuge don't you know etc )
This is actually one of my main concerns with CUE and something I'd like the community overall to keep in mind. There is a high chance, in my opinion, that CUE ends up being adopted as a silver bullet for configuration just by being cargo-culted as something that comes from Google, helped along by the drive for dependence that open source users sometimes have, sort of similar to what happened with Kubernetes. That modules and package management in CUE are modeled after what Go did is also questionable IMO, but makes sense since the project is Go based. CUE is a good tool for validation and has its uses, but Dhall has some really neat innovations that make it an exciting project, so it's worth looking at both and comparing them first.
CUE talks about unification, is that similar to prolog meaning of that term ?
Just my 2 cents - when I started at my current employer, we had a huge, convoluted Dhall project for kube. We ended up switching to a real language (Python in our case due to reasons, Go is a more correct choice) and are very pleased with the results.
I thought the same thing. As a Node.js dev, I can quickly create a .mjs file and use `export default` to return a JavaScript object that contains the configuration. Thus, I can use template strings and/or functions to do what I want. I can even load JSON files natively.
I like Dhall for what it gives you with type safety etc., but the compile times are extremely slow on a large project and type signatures can get really long/hard. It is a bit of a Dhall monolith though ;)
> we're giving more thought to writing a more robust tool using a more traditional language like Go to solve some of our configuration/deployment and data templating problems.
You might want to give Pulumi (no affiliation) a look. I've been using it with Typescript, but it supports Go, too.
> a more robust tool using a more traditional language like Go
I heard you liked configuration languages, so I wrote a configuration language for your configuration language.
How long before you stick a Yaml config into your Go configuration library for easier maintenance? And so the cycle continues.
"Dhall... that you can think of as: JSON + functions + types + imports". Is this Typescript?
jsonnet seemed like a great idea to me, but I've experienced extremely low performance. My Kubernetes manifest with a couple hundred lines of code in 4 files, which rendered instantly in Python before and renders instantly now with Helm, would take 10-20s with jsonnet. The "lazy evaluation" would probably go into some quadratic or exponential behavior, evaluating the same thing many times, but I sure couldn't see why (it was very straightforward code) or where (no debug tools...).
I really wouldn't use it for anything until tooling improves (but then again I'd much rather use something like Starlark).
> jsonnet seemed like a great idea to me, but I've experienced extremely low performance.
Although each implementation of jsonnet has some quirks, take a look at sjsonnet^1 (scala-based) or go-jsonnet^2 for improved performance. We generally prefer go-jsonnet.
There's also a Rust version^3 that claims to be the fastest yet^4, but I haven't experimented with it at all.
[1]: https://github.com/databricks/sjsonnet
[2]: https://github.com/google/go-jsonnet/
[3]: https://github.com/CertainLach/jrsonnet
[4]: https://gist.github.com/CertainLach/5770d7ad4836066f8e0bd91e...
just another 'have you looked at' : https://carvel.dev/ytt/
ytt lets you embed logic via a python-subset (starlark) and also provides "overlays" as a "replace/insert" mechanism. and all valid ytt files are valid yaml files, so they can be passed-through other yaml parsing stages.
Thanks for sharing! We've done a little experimenting with ytt (we already use several tools from Carvel/k14s, mainly vendir and kbld), but it's been a while...probably a year or more...since I've played with it. Need to give it another look.
CUE is far superior to Jsonnet from my experience. The Validation and abstractions feel much more native
> using a more traditional language like Go to solve some of our configuration/deployment and data templating problems.
Have you looked at https://github.com/kris-nova/naml ? :)
I had not seen that but it looks super interesting and potentially useful! Thanks for sharing!
Relying on configuration files to glue together a huge stack of tools and services in order to produce an end product is a huge mistake. I say it's a mistake fully knowing that nearly everyone in the industry is doing it.
Configuration is opaque. Undebuggable. Unmaintainable. By its very nature. No matter what "language" you use for it.
You should strive to keep configurable things to an absolute minimum.
Although it's worth making a distinction between "configurations" and "settings": when something is configured wrong, the application basically fails to operate as intended. For settings, there can't be "wrong" settings, all settings are valid, and if the input for a specific setting was invalid, the application can simply ignore it and use the default value. Example of a "setting": font size for a text editor. Example of configuration: the connection information for a database.
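A minimal Python sketch of that distinction, under stated assumptions: the function names and the `font_size` / `database_url` keys are hypothetical, chosen only to mirror the two examples above. An invalid setting quietly falls back to a default, while missing configuration stops the program.

```python
DEFAULT_FONT_SIZE = 12  # hypothetical default for the "setting"


def load_font_size(raw: dict) -> int:
    """A setting: any bad or missing value falls back to the default."""
    value = raw.get("font_size")
    if isinstance(value, int) and not isinstance(value, bool) and 6 <= value <= 72:
        return value
    return DEFAULT_FONT_SIZE


def load_database_url(raw: dict) -> str:
    """Configuration: without a usable value the program cannot operate."""
    value = raw.get("database_url")
    if not isinstance(value, str) or not value:
        raise ValueError("database_url is required and must be a non-empty string")
    return value
```

With this split, `load_font_size({"font_size": "huge"})` simply returns the default, whereas `load_database_url({})` raises instead of guessing.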
Wholeheartedly agree - convention before configuration wins with tooling even at modest scale.
Larger team(s) using free form yaml configuration and templating quickly becomes messy.
Setting a boundary at “configuration” and exposing configuration options to dev teams through settings works really well.
It requires you to have a tooling team with a few developers, but it scales!
Convention over configuration is really the future, but not everything follows that philosophy. For example, OCaml codebases are basically a free for all and you can organize things however you want. So with that in mind, you need something to organize and build.
One problem with convention is that it isn't discoverable. At least config files guide you to where to look for stuff.
But even so I think you're probably right that convention is better because it forces everyone to use the same structure for stuff.
Convention before configuration doesn't solve the core problem.
Why do you need configuration? Because you build your application by gluing together a multitude of different applications that need to be configured properly to be able to talk to each other.
Once you understand this, the solution is obvious: don't build your application this way.
Build it in one language. If you want to "reuse" existing solutions, reuse them as libraries, not as separate programs.
> Configuration is opaque. Undebuggable. Unmaintainable. By its very nature. No matter what "language" you use for it.
What makes you think so? How do you define configuration?
I mean, in the extreme case you can see Python as a configuration language for the behaviour of the Python interpreter. Does that make Python a configuration language? Where do you draw the line?
Configuration is some sort of information that is stored in a text file, for reading by one or more programs, where these programs cannot meaningfully operate without this critical information in the configuration files.
If you have ever worked in the web startup industry, their use is endemic in everything. You often cannot deploy any web application without properly understanding tons of configuration files.
Because everyone builds applications out of services that must be glued together, and these services often glue themselves together via configuration files. The Python side of the application code needs to figure out how to connect to all the databases (postgres, redis, elastic, memcached ... etc). Each of these databases sometimes needs its own configuration before it can meaningfully operate. And because everyone is using Docker, you need docker files for all of your services. The list of things that need text-based configuration files goes on and on.
Go to any web company and ask around to see who can meaningfully understand and edit or maintain these configuration files, and you will find the majority of the people have no idea what's going on inside them. It's very common that only two people have a decent understanding of all the configs.
Now to answer your core question:
Why are they opaque and undebuggable?
Because to understand the configuration files, you have to understand the programs they are intended for and the environments they end up in, and the language they are written in.
For example, to understand a three-line docker compose configuration for ha_proxy, you have to understand the language of docker compose, and the way configuration files are aggregated. For example, you can define some VARIABLES in one config file and have them be available for reading by another docker compose config file in a totally different place. But you have to understand under what conditions which files are included together. For instance, you can have 10 files define different values for one VARIABLE and each of them runs in a different environment. You have to understand all of this. And we haven't even gotten to ha_proxy yet. Now you have to understand how ha_proxy is influenced by the configurations you have specified. Let's say it's telling it about another server to connect to, or a directory to serve static assets from. The path of this directory depends entirely on how other docker images are composed together, which in turn is usually influenced by the contents of many other docker compose config files that are - again - spread out across many files. You need to figure out _which_ config file is used to make a certain directory available at a certain location, etc etc.
There's no tool that can debug this mess. There's no substitute for deep and intimate knowledge about all the details of the system.
It's very possible for the whole system to collapse when you make one tiny small mistake in one configuration file. There's no tool that can help you debug what's going on.
Dhall is my favorite configuration language that I never get around to using.
I manage DNS in Terraform, and since every Terraform provider uses different objects definitions, and every object definition is rather verbose, Dhall would be a way to specify my own DRY types and leave the provider-specific details in one place. Adding new DNS entries and moving several domains between providers would be a matter of changing fewer lines.
Dhall also has Kubernetes bindings:
https://github.com/dhall-lang/dhall-kubernetes
Although I'm tempted to just stick to Helm here: even though it's less type-safe, Dhall's verbosity makes me reconsider.
I'd like to hear if anyone has used dhall-kubernetes if they like it.
I use tanka/jsonnet and cringe every time I read a helm chart. Type safety would be nice, but the k8s api can verify the validity on the server.
https://tanka.dev/
Tanka is what I want to use when the time comes.
Are there any pitfalls you have learned to avoid?
tanka helped me make peace with k8s (yaml), or at least made for a better learning environment for k8s than the other options. I wonder about other people's experiences or their config tool rite of passage.
We use https://github.com/octodns/octodns for some of our DNS records. It's flexible, much faster than Terraform for thousands of records, and the maintainer Ross has been responsive on issues and pull requests. Also see Cloudflare's blog for how they use it
I think Dhall is good at pushing forward some ideas, but honestly I feel like Skylark (Python but not turing complete) just feels like the right way forward. Being able to specify dynamic functionality in a configuration file, when paired with a good configuration API, really makes stuff straightforward IMO. Gunicorn is the best example of this.
We have all of these tools that try to propose declarative configuration, then run into the fact that people really do want dynamic systems with some abstraction capabilities, and then have to overlay that into their systems.
I feel increasingly alone in this position but I absolutely hate those Python-like languages.
They look like Python, and they occasionally act like Python. But occasionally they don't, and the mask of looking like Python obscures those times. Then you never really know which Python facilities you have and which ones you don't.
Is skylark (now called starlark btw) general purpose? It's my understanding that it's largely used for configuring bazel.
It is used by other tools like Tilt https://tilt.dev/ (a Kubernetes dev tool)
https://github.com/google/starlark-go
Comparison of similar projects, including Dhall:
https://github.com/oilshell/oil/wiki/Survey-of-Config-Langua...
> Can you spot the mistake?
Nope, so now I have no incentive to use your config format because it's established something is wrong and it's completely non-obvious.
Thanks for not wasting my time, I guess.
This opinion is ironic because this is exactly what the author intended to describe -- the typo IS hard to find.
The "Hello, World" example is nothing more than JSON, showing a string repeated 3 times. On the third time, "bill" is misspelled with "blil". The next tab shows how Dhall uses a variable definition to prevent this type of error.
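The same idea can be sketched outside Dhall. The Python below is only an analogy, and the field names and key paths are assumptions made up for illustration (only the "bill" / "blil" typo comes from the example being discussed): writing the name once and deriving the other strings from it removes the place where the typo could hide.

```python
import json

# Plain-data version: the same string is typed three times, so a typo
# ("blil") in one copy is still perfectly valid data.
literal_config = {
    "home": "/home/bill",
    "privateKey": "/home/bill/.ssh/id_ed25519",
    "publicKey": "/home/blil/.ssh/id_ed25519.pub",  # typo, goes unnoticed
}

# Variable version: the user name is written once and reused.
user = "bill"
home = f"/home/{user}"
derived_config = {
    "home": home,
    "privateKey": f"{home}/.ssh/id_ed25519",
    "publicKey": f"{home}/.ssh/id_ed25519.pub",
}

print(json.dumps(derived_config, indent=2))
```

A typo in `user` would still be possible, but it would be wrong consistently and in exactly one place.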
I too spent the whole time scrutinizing the syntax and ignoring the values, since this is an intro to the language and not to Bill’s dotfiles. I was ultimately unable to deduce the presumed syntax error and independently came to the same misconclusion.
I took it to mean "can you find the syntax error". It wasn't clear they meant the text
Sure, but how do I know that the user doesn't store their public key in /home/blil?
I mean... from context, it's probably a mistake, but on the other hand we don't actually know the structure of that filesystem or the author's intentions.
They are trying to tell a story about how Dhall helps you avoid hard to notice errors, but they aren't doing a great job.
Would be cool if they had a three panel story:
* using json with the typo
* using dhall with the typo
* using dhall with variables to avoid the typo
Might be easier to understand.
Meh, I think you're pointing out why doing this with a variable is better. In other words that mistake is the entire point of the example. Perhaps the definitions tab could make it more clear what the mistake was.
Yeah, I spent 3 minutes looking for a syntax error in 5 lines of code. I couldn’t find anything. I’m obviously too dumb to figure out this language so I guess I’ll just stick with YAML.
How could they make the mistake obvious? It's a mistake in a string value!
It had me confused too. I figured I was looking for a syntax error in the config format since it’s an introduction…to the config format.
How about not making the very first thing on the page a confusing puzzle?
Previous thread w/ many comments: https://ucg.marzhillstudios.com
I like a lot of the ideas in dhall, but I disagree with some of the design decisions.
The syntax for objects with dynamic keys seems unnecessarily verbose.
The arithmetic capabilities are very restrictive. No subtraction, division, or modulo. Addition and multiplication only work on Natural. No numeric comparison. == and != only work on booleans. There isn't really much motivation given for why it is so restricted.
Maybe I'm missing something, but it seems like if a type has a lot of optional fields, which is pretty common, you have to explicitly pass None for all of them. Maybe the pattern is to merge a default record with a smaller record that has what you actually want? But I didn't see any examples of that. Also, how do you deal with cases where null and the absence of a value are treated differently? For example, if leaving the value off means use the default and null means to turn a feature off. From what I can tell, optionals are either always null or always absent when None.
It also seems like it would be annoying to have to specify types whenever calling functions like List/length.
IDK, maybe in practice these aren't as bad as they seem.
> Maybe I'm missing something, but it seems like if a type has a lot of optional fields, which is pretty common, you have to explicitly pass None for all of them.
You can specify defaults for your record types via the Type/default pattern https://github.com/Gabriella439/dhall-manual/blob/develop/ma...
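For readers who have not seen the "merge a default record with a smaller record" idea, here is a rough Python analogy. It is not Dhall and not the linked pattern itself; the field names and default values are hypothetical.

```python
# Hypothetical defaults for a record type with several optional fields.
DEFAULTS = {
    "replicas": 1,
    "timeout_seconds": 30,
    "namespace": "default",
    "debug": False,
}


def with_defaults(overrides: dict) -> dict:
    """Return DEFAULTS updated with only the fields the caller cares about."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


config = with_defaults({"replicas": 3})
```

In this sketch an absent key takes the default (`timeout_seconds` stays 30), while an explicit `None` override is kept as `None`, which is one way to keep "use the default" and "turn it off" apart.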
Count me as another big Dhall fan -- Dhall was a huge improvement on our codebase compared to Helm for large Kubernetes deployments.
Full disclosure: I maintain the [dhall](https://pypi.org/project/dhall/) package on PyPI
As much as I love Dhall, I think it's actually better to allow any turing complete programming language and require it to generate the config in a certain format (whatever it is, maybe even dhall).
The only thing needed is to run the config-generation process in a sandbox and restrict how long it can run. If it takes too long then we stop it - the same as we should do with a Dhall config as well, even if Dhall theoretically will finish in a finite amount of time.
That way everyone can use their preferred language to describe and maintain the configuration. We just have to agree on the format that is being spit out at the end, such as json, yaml, dhall or even a proprietary format.
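A minimal sketch of the time-limit half of that proposal, using only Python's standard library. The helper name `generate_config`, the 5-second default, and the inline generator are assumptions for illustration. Note that this only enforces a timeout; it does not isolate the filesystem or network, so it is not a real sandbox.

```python
import json
import subprocess
import sys


def generate_config(command: list[str], timeout_seconds: float = 5.0) -> dict:
    """Run a config generator, stop it if it is too slow, and parse its JSON output."""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
            check=True,               # a non-zero exit raises CalledProcessError
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"config generator exceeded {timeout_seconds}s") from exc
    return json.loads(completed.stdout)


# Hypothetical generator: any program in any language that prints JSON would do.
generator = [sys.executable, "-c", "import json; print(json.dumps({'replicas': 3}))"]
config = generate_config(generator)
```

The agreed-upon output format here is JSON; the generator itself could be written in whatever language a team prefers.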
> The only thing needed is to run the config-generation process in a sandbox and restrict how long it can run.
That's a very brittle restriction at best, and if implemented strictly on time, it's a very flaky one as well.
How so? Normally, generating a config, even a big one, is a matter of milliseconds even for slow languages.
We are not talking about applying the config - we are just talking about using some inputs (variables) and generating a string / text-file that we return to the entity that then validates and uses the config.
Not sure why that would be flaky.
I love seeing useful languages that aren’t turing complete. Tooling gets much more interesting when you don’t have so many impossibility results around all the interesting stuff.
What practical advantages are there? Someone can always write a program that runs until the heat death of the universe even in a Turing-incomplete language.
> What practical advantages are there?
The major advantage of a language that isn't Turing complete is not having the major risk inherent to Turing complete languages: asking if any non-trivial program will produce any given result or behavior is undecidable[1].
> write a program that runs until the heat death of the universe even in a Turing-incomplete language.
The Halting Problem is just a simple example of program behavior. The undecidability extends to any other behavior. Asking if a given program will behave maliciously is still undecidable even if we only consider the set of programs that do halt in a reasonable amount of time.
When you are using a regular language or a deterministic pushdown automaton, questions about behavior, or even whether two implementations are equivalent, are decidable. It is at least *possible* to create software/tools to help answer the question "is this input safe?" When you use a non-deterministic pushdown automaton or anything stronger, your problem becomes provably *undecidable*.
I highly recommend the talk "The Science of Insecurity"[2].
[1] https://en.wikipedia.org/wiki/Rice%27s_theorem
[2] video: https://archive.org/details/The_Science_of_Insecurity_ slides: [pdf] https://langsec.org/insecurity-theory-28c3.pdf
In theory that's true for many of the kinds of Turing-incomplete languages we care about. (Eg it's not true for JPG or for (proper) regular expressions.)
From Dhall's docs (emphasis mine):
> Note that a “finite amount of time” can still be very long. For example, there are some short pathological programs that take longer than the heat death of the universe to evaluate. The main benefit of evaluation being finite is not to eliminate long-running programs but to make them significantly less probable. In practice, you will discover that you will rarely author a configuration file that takes a long time to evaluate by accident.
> For example, Dhall does not provide language support for recursion. If you try to define a recursive expression or function you will get a type error. Lists are the only recursive data structure and the only way to build or consume lists is through safe primitives guaranteed to terminate, like List/fold. This restrictive programming style keeps code simple and makes expensive code more obvious (both to the code author and reviewer).
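The fold-only style described in that quote can be illustrated with a Python analogy. Python does not enforce anything here, and the record fields are hypothetical; the point is only that a fold over a finite list runs its step function once per element, so it finishes as long as the step function does.

```python
from functools import reduce


def total_memory_mib(containers: list[dict]) -> int:
    """Sum a field over a finite list with a fold; no recursion or open-ended loop."""
    return reduce(lambda acc, c: acc + c["memory_mib"], containers, 0)


containers = [
    {"name": "web", "memory_mib": 256},
    {"name": "worker", "memory_mib": 512},
]
print(total_memory_mib(containers))  # 768
```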
Totality checking like that in Idris is one nice example. That's kind of circular though - it isn't Turing complete because it has totality checking (i.e. you must prove the program terminates).
Used this for a fairly complex, large project.
It really is Haskell for configuration languages and undoubtedly superior to YAML.
But for my grug brain, function currying and the lack of for loops made it hard.
Functional languages continue to be a great place to steal from but pure functional requires warping your brain quite a bit.
Maybe Tanka is more my style
Also look into Starlark or Cue
I absolutely loved using this tool to generate many instances of Datadog resources in terraform json. Not only did it simplify a nasty mess of dynamic typed and schema-less bash/python/mako, it taught me a lot of functional programming paradigms along the way.
I've always wanted an excuse to use this but never had a reason. The closest I come is using Nix, and I know there is a Dhall-to-Nix compiler, but since Dhall can't represent everything possible in Nix, I haven't seen any good reason to use it.
Dhall is advertised as a configuration language but you can do a tad more. I use it in my blog-engine to fit a use-case I found was poorly addressed by other approaches: small-cardinality datasets that benefit from type-checking and templating (e.g., list of notes, a photo gallery). I don't claim the idea is especially novel but I found the use case rare and interesting enough to write some explanations, design, and a demo here: https://lucasdicioccio.github.io/dhall-section-demo.html
Related:
Dhall: A Non-Repetitive Alternative to YAML - https://news.ycombinator.com/item?id=20355405 - July 2019 (177 comments)
JSON is a strange choice, with all those unnecessary commas. S-exprs / EDN would be a better choice, especially as this is a functional language.
Strange choice for what? It's just one of the compile targets.
> Dhall is a programmable configuration language that you can think of as: JSON + functions + types + imports
At the top of the main page.
See also the Hello World and other examples.
I wonder if a version of Python with types with some way to force programs to be total would be a 'good enough' substitute. JavaScript has `"use strict";`, `# use total` could ban recursion and loops over non-constants. You'd have to work hard to ensure things were really total, e.g. ban assigning functions to variables, but maybe you could make it hard enough to pay off.
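As a very rough sketch of what such a `# use total` check might start from, the snippet below uses Python's `ast` module to flag `while` loops and functions that call themselves by name. The function name is hypothetical, and this is nowhere near a real totality check: it misses mutual recursion, functions reached through variables, and `for` loops over unbounded iterables.

```python
import ast


def totality_violations(source: str) -> list[str]:
    """Flag `while` loops and directly self-recursive functions in `source`."""
    problems = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.While):
            problems.append(f"line {node.lineno}: while loop")
        elif isinstance(node, ast.FunctionDef):
            # Look for calls to the function's own name inside its body.
            for inner in ast.walk(node):
                if (
                    isinstance(inner, ast.Call)
                    and isinstance(inner.func, ast.Name)
                    and inner.func.id == node.name
                ):
                    problems.append(f"line {inner.lineno}: {node.name} calls itself")
    return problems
```

A comprehension over an existing list passes this check, while `def f(n): return f(n - 1)` does not.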
Starlark is a version of this, and it's used by Bazel in particular. It's very cool tech, and sidesteps the difficulties of ensuring Python works without shipping a specific Python interpreter.
Didn't know about Starlark, that is awesome.
Assigning functions to variables doesn't have to ruin your totality.
But yeah, you'd need to carefully select your subset of the language that is both useful and total, and practical to implement.
I have been writing a fair amount of Dhall using autogenerated CloudFormation bindings ( https://github.com/jcouyang/dhall-aws-cloudformation/ ). It is a fantastic way to reduce boilerplate and factor out recurring blobs. My main frustration is that the type checker is not smart enough (or maybe the type system is undecidable?) - every time you want to use a polymorphic function, you must pass in the type parameters yourself (this is also true for empty lists and `None`). This makes simple FP idioms extremely noisy, to the point where you're better off writing longhand. In a language that's meant to be alleviating YAML/JSON boilerplate.
It's still a massive improvement, but it could be so much better if the typechecker was smarter.
Whatever happened to Puppet, Ansible or Salt frameworks? They were the darling of the devops folks 7 years back.
They are different from Dhall. Puppet, Ansible and Salt are task runners. You give them configuration and they parse the configuration and then run tasks on a destination machine based on the configuration.
Dhall is a tool to generate configuration such as JSON, YAML, or XML. It’s useful if you have *complex* configurations such as an AWS CloudFormation stack or Kubernetes YAML files.
Ah ok... thanks for those details. Puppet was also called a configuration manager and it had the capability to generate configuration files of any format in a dynamic way (via support of variables, parameters etc).
Kubernetes ate their lunch, mostly.
No. Tools like Ansible are just wrappers around ssh when you need to run a command across many servers. They don't do anything k8s does.
Or Terraform or Helm or ...
this looks super useful, but it's also horribly ugly.
I disagree regarding ugly, but it’s rooted in Haskell syntax fwiw.
I loved my experience with Dhall from the PureScript community. It prompted me to learn a lot more and I built some small libraries too. My experience was mostly positive. Supporting Unicode is the cherry on top.
Of all my gripes, only two still stick in my mind:
• No JavaScript implementation; I want to be able to use it in more places, but I always end up having to convert it to JSON to use it
• The built-in formatter is way too aggressive; I'm fond of the never-contract-only-expand formatters that don't pressure me to collapse things I don't want collapsed, and the way it is now doesn't optimize for Git diffs, where conflicts can arise.
Dhall needs to provide official bindings to all major platforms (these days that’s JS, Python, .NET, JVM). It looks like a really cool project but I don’t use Haskell or Ruby.
Has the situation improved?
This is the most awkward abuse of formatting I've seen in a long time. Just allow/require a final trailing comma, and this nonsense goes away
This is a by-product of Dhall coming from the Haskell community where this formatting was preferred instead of final trailing commas. It has been internalised and was carried on. I say this as someone who looks at this code and thinks "This is totally fine." -- but I admit, I'm environmentally damaged.
Yes. And, alas, you can't just allow a final trailing comma everywhere in Haskell.
Eg ("Foo", 2,) is different from ("Foo", 2) in Haskell thanks to TupleSections. For innocent bystanders: in Haskell ("Foo", 2) is the tuple you'd expect it to be. But ("Foo", 2,) is a function that takes another argument and creates a three-tuple. A Python equivalent would be
lambda x: ("Foo", 2, x)
Or allow a leading comma:
Or, to make it markdown-ish:
This is just to play with the idea that leading punctuation may be preferable because it all aligns in the same column.
> Or allow a leading comma
This actually works just fine.
I would actually prefer no commas
That's the same as a leading comma.
This style is common in Haskell, Nix and Dhall. It also sees more "mainstream" use, e.g. in SQL.
I much prefer this style, I use it wherever I can because I can comment out lines without a trailing comma causing a syntax error. SQL, Ruby… wherever I can.
Try it, I doubt you'll go back (unless someone's stupid parser doesn't let you).
You can do the same, if your language allows a final comma.
(And that's what the comment you reply to suggests: make Dhall allow a final comma.)
I feel this is the superior way to format something like that. Trailing commas are nothing but an ugly hack.
Yeah, I also really dislike this mixing up of indentation levels because of a language limitation. The "{" denotes a container, and the "," a separation between items of said container, for me it makes sense for them to be nested a level deeper
That's one way to interpret things, but not the only way.
In practice, this style reads just fine once you get used to it. Indentation is mostly there as a human convenience, so just has to work well with human brains (and be understood by a computer, if it's significant), but doesn't have to necessarily follow some abstract unified theory of syntax trees.
both symbols are delimiters, one can choose to align them together even when they are not the same character. Opening and closing braces aren't the same character either, but people have been aligning them for ages, I don't see a reason why commas, while being part of the same expression, should not follow the same principle.
Trailing comma is ugly as sin - I love the little "house" my leading commas put my record into :)
Dhall has a lot of really cool ideas, it would be great to have semantic integrity checks on imports in other languages too.
Comparison with a related language (perhaps biased, since opinion is from creator of that language).
"Comparisons between CUE, Jsonnet, Dhall, OPA, etc." https://github.com/cue-lang/cue/discussions/669
I have been recently looking for a program that does a similar thing, ie simplifies creation of configs. And I did not find a suitable one by Googling. Then I realized the PHP is perfect for that and I already know it! The output of PHP does not have to be HTML or other web stuff, instead it can be a config file.
Apparently I'm in the minority that feels config languages should be little more than namespaces and key-value storage and should not be programmable.
I would not want to use a bash script as a config file, for example.
I too wouldn't want to use a bash script as a config file. That's the point of dhall: you get good static types and it's not turing complete, so you get a lot of the safety and code reusability tools without the pitfalls of a general purpose programming language.
And, have you ever worked with giant configs in json or yaml? It becomes incredibly painful to manage.
I prefer to use multiple, smaller config files to keep them manageable.
My biggest pain when doing this is: set up an environment configuration. Then set up a test or stage environment that matches it. Then modify the environment over the next few years keeping the stage and production environment configs in sync so that testing and validation are useful (the stage configuration actually matches production so that success or failure in staging predicts the same in production) before going to production.
Even something as simple as variable substitution becomes very useful in these cases so that configuration can be a single file and substitution delivers the staging or production config. Functions and operators allow more complicated configurations to remain DRY. Checks prevent the stage servers from using the production database connection string, etc.
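To make that concrete, here's a rough sketch in Python of substitution plus a cross-environment check. It's only an illustration: `BASE`, `OVERRIDES`, `render`, and all the hostnames are made up for the example, not taken from any real setup.

```python
from string import Template

# Shared settings; $env and $db_host are filled in per environment.
BASE = {
    "service_url": "https://api.$env.example.com",
    "db": "postgres://$db_host:5432/app",
    "replicas": "3",
}

# Only the values that differ between environments live here (hypothetical hosts).
OVERRIDES = {
    "staging": {"env": "staging", "db_host": "db.staging.internal"},
    "production": {"env": "production", "db_host": "db.production.internal"},
}


def render(env: str) -> dict:
    """Substitute per-environment values into the shared template."""
    values = OVERRIDES[env]
    config = {key: Template(raw).substitute(values) for key, raw in BASE.items()}
    # Check: an environment must never point at another environment's database.
    for other, other_values in OVERRIDES.items():
        if other != env and other_values["db_host"] in config["db"]:
            raise ValueError(f"{env} config points at the {other} database")
    return config
```

Everything shared stays in one place, so staging can only drift from production in the handful of values listed under `OVERRIDES`.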
People tend to evaluate these things based on what cool and neat things they can do with them, not on whether telling people they need to learn Haskell before they can update what would've been a 30-line configuration file on the project they just joined is a good use of people's time.
I think more programming power is useful in the service of stopping bugs; for instance, you could have a language where the part that "does stuff" isn't Turing complete, but the part that "stops incorrect stuff from happening" (the type system) is.
We use it for mina (https://github.com/MinaProtocol/mina). AMA
Dhall desperately needs a DefinitelyTyped-like repo with tooling support and it's bound to explode. It'd be just too good of a solution to ignore.
Config's type is Type?
Clear as mud.
Config is a type, so its type is Type.
Figured, but it's a pretty goofy way to introduce it after that comment.
[EDIT] I mean, you can almost "who's on first?" this.
"OK, what type do you want this to be?"
"Type."
"Yes, what type do you want it to be?"
"Type."
"Great, ok, yes, the type, what type is Config?"
"Type."
"WHAT IS THE NAME OF THE TYPE YOU WANT CONFIG TO BE???"
"Type."
flips desk
Tried to use it and it didn't fit my use case: recursive types are a bit difficult.
Looks great otherwise
What kind of software configuration requires recursive types? I've never had to encode trees in my configs so far.
An explicit representation of JSON requires recursive types.
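For anyone wondering why: a JSON value can contain lists and objects whose elements are themselves JSON values, so the type has to mention itself. A sketch in Python, where that kind of self-reference is allowed (the names `JSON` and `is_json` are just for this example); as far as I know Dhall has no built-in recursive types, so there the same thing has to be encoded indirectly.

```python
from typing import Dict, List, Union

# A JSON value is a scalar, a list of JSON values, or an object whose
# values are JSON values: the definition refers to itself.
JSON = Union[None, bool, int, float, str, List["JSON"], Dict[str, "JSON"]]


def is_json(value: object) -> bool:
    """Check a value against the recursive definition above."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return True
    if isinstance(value, list):
        return all(is_json(item) for item in value)
    if isinstance(value, dict):
        return all(isinstance(k, str) and is_json(v) for k, v in value.items())
    return False
```

Note that the checker has to recurse exactly where the type does, in the list and object cases.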
github.com/coralogix/dhall-concourse
Yikes! Don't do it folks! Business logic belongs in the apps, not in the config.
anyone using this for kubernetes config generation? What's been your experience?
Don't. Use helm, kustomize or a decent code language. Dhall will constrict, slow you down, make onboarding a nightmare, and ultimately be as brittle as other alternatives (Only it's harder to find where it broke). I cannot advocate against dhall enough.
this is what I thought too. I've enjoyed working with helm and totally recommend it but was wondering if I'd missed something. A templating language and not an actual programming language seems to be the right balance for config
What are the advantages of this being its own language vs a DSL?
Why wouldn't you use Python as your configuration language?
Have you seen Starlark? It's not too far from that, but safer in a number of ways: https://github.com/bazelbuild/starlark
I said this above as well: ytt (https://carvel.dev/ytt/) lets you embed starlark into valid yaml, among other cute tricks for managing biz-logic in configs.
For one thing, Dhall is not Turing complete. You can also freeze imports to prevent supply chain hacks, so it is much safer in theory.
They have a great writeup on their safety here: https://docs.dhall-lang.org/discussions/Safety-guarantees.ht...
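The freezing idea, very roughly, is to pin a hash next to the import and refuse anything that doesn't match. Here's a simplified sketch in Python; `load_frozen` is a made-up helper, and it hashes the raw bytes, whereas my understanding is that Dhall hashes a normalized form of the imported expression, so treat this as an analogy rather than how Dhall does it.

```python
import hashlib


def load_frozen(content: bytes, expected_sha256: str) -> bytes:
    """Return the imported content only if it matches the pinned hash."""
    actual = hashlib.sha256(content).hexdigest()
    if actual != expected_sha256:
        raise ValueError(
            f"integrity check failed: expected {expected_sha256}, got {actual}"
        )
    return content
```

If whoever hosts the import swaps its contents, the hash no longer matches and the load fails instead of silently pulling in the new code.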
Most of the time with configuration files, you want to be able to do at least basic validation on the config without having to run the whole piece of software. Python does not enforce types before the code runs, so there's nothing to stop the user from creating a configuration file where, for example, a field intended to specify a port number is set to a string. This error will become obvious when you run the software (and it presumably fails some way in), but for some use cases this is too late. With Dhall you can specify that the port number field must be not just an integer but a natural number, so you're able to catch plenty of config errors nice and early, and can go some way toward validating the config in isolation from the software it's intended for.
I believe you may want a non-Turing-complete language for strict configuration.
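For comparison, getting the same early check with a Python config means writing the validation by hand, something like the sketch below (the `validate` function and the `port` key are hypothetical, just for illustration):

```python
def validate(config: dict) -> list:
    """Return a list of problems found in the config, without starting the app."""
    errors = []
    port = config.get("port")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(port, int) or isinstance(port, bool):
        errors.append(f"port must be an integer, got {type(port).__name__}")
    elif port < 0:
        errors.append(f"port must be a natural number, got {port}")
    return errors
```

Calling `validate({"port": "8080"})` reports the string port before any of the real software runs, but only because someone remembered to write and call the check.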
But sometimes, as you indicate, you want VERY dynamic configuration. But I would argue then that such logic goes in your application itself and is not in fact part of your configuration.
Sometimes you just explicitly want to output text or a structured format, but you'd rather have a template language do it because it being a lot less powerful actually makes it easier to write close to the output format.
I really hope that one day Google can open source GCL, which so many languages have been inspired by without ever really being as good as it is. They all claim to address pitfalls in GCL but at the end of the day they're just worse than GCL.
Can you say more about what GCL does better than all of the open source ones?
Anecdotally, I've heard a lot of GCL horror stories, and many Xooglers have chosen to create things like Jsonnet or Skycfg (https://github.com/stripe/skycfg) instead.
It creates nested data without looking like the end format of the data (why I really can't get into jsonnet), it's just obviously its own language but is pretty minimal.
Despite the minimalism it is not necessarily simple. There are features like inheritance and late binding. They can be quite complicated, but the thing that really stands out about GCL is it's not complicated to use. The simple cases are easy, straightforward, and readable. The complicated cases are made possible.
I've used it extensively to store and transform data that has to be manually edited by humans. It's a really great format because I can define all the translation rules which can get pretty complicated, but the complexity I expose to the humans who are writing the data is minimal. But if they need to do some transformations on the data or even just, like, string substitutions...the language is there and they can use it. That's what makes writing your data in a configuration language nice.
(Honestly though I think jsonnet is at least vastly superior to skycfg and the whole starlark ecosystem. I swear I'm so sick of languages that intentionally look like Python without being actual Python!)
Isn't CUE basically that?
https://cuelang.org/docs/about/#history
Cue doesn't resemble GCL at all.
From that link:
> However, the early design of GCL went for something simpler that coincidentally was also incompatible with the notion of graph unification. This simpler approach proved insufficient, but it was already too late to move to the earlier foreseen approach. Instead, an inheritance-based override model was adopted. Its complexity made the earlier foreseen tooling intractable and they never materialized. The same holds for the GCL offsprings that copied its model.
Needless to say I disagree with the Cue author. I think the inheritance-based override model is fantastic and has made for a great and straightforward configuration language.
And yet, even if they did it right now, it'll be over there in the pile of "yet another configuration language" for the same reason we already have like 50 of them
Damn network effect, ruining everything :-/
Why not instead use one of the popular programming languages (C#, Java, ...), possibly with some dedicated library built on top of it? Also, reminds me of the Heptagon of Configuration [1].
[1] https://matt-rickard.com/heptagon-of-configuration