Comment by 0xbadcafebee

3 years ago

Python's .netrc library also hasn't supported comments correctly for like 5 years. The bug was reported, it was never fixed. If I want to use my .netrc file with Python programs, I have to remove all comments (that work with every other .netrc-using program).

It's 2022 and we can't even get a plaintext configuration format from 1980 right.
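For anyone stuck on the same problem, the "remove all comments" workaround can be done in memory instead of by editing the file. This is a minimal sketch, not a fix for the underlying bug: the helper name `load_netrc_without_comments` is made up for illustration, and it assumes comments only appear as whole lines starting with `#` (it would also drop such lines inside a `macdef` block, or a token that genuinely starts with `#` on its own line).

```python
import netrc
import os
import tempfile


def load_netrc_without_comments(text: str) -> netrc.netrc:
    """Parse .netrc text after dropping full-line '#' comments.

    Hypothetical helper: assumes a comment is any line whose first
    non-blank character is '#'.
    """
    kept = [
        line for line in text.splitlines()
        if not line.lstrip().startswith("#")
    ]
    # netrc.netrc() wants a file path, so write the filtered text to a
    # temporary file and parse that instead of the original.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".netrc", delete=False
    ) as tmp:
        tmp.write("\n".join(kept) + "\n")
        path = tmp.name
    try:
        return netrc.netrc(path)
    finally:
        os.remove(path)


sample = (
    "# personal credentials\n"
    "machine example.com\n"
    "  login alice\n"
    "  # rotated recently\n"
    "  password s3cret\n"
)
parsed = load_netrc_without_comments(sample)
login, _account, password = parsed.authenticators("example.com")
print(login, password)
```

The original file stays untouched, so the comments keep working for every other .netrc-using program.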

> It's 2022 and we can't even get a plaintext configuration format from 1980 right.

To me, it's more depressing that we've been at this for 50-60 years and still seemingly don't have an unambiguously good plaintext configuration format at all.

  • I've been a Professional Config File Wrangler for two decades, and I can tell you that it's always nicer to have a config file that's built to task rather than being forced to tie yourself into knots when somebody didn't want to write a parser.

    The difference between a data format and a configuration file is the use case. JSON and YAML were invented to serialize data. They only make sense if they're only ever written programmatically and expressing very specific data, as they're full of features specific to loading and transforming data types, and aren't designed to make it easy for humans to express application-specific logic. Editing them by hand is like walking a gauntlet blindfolded, and then there are the implementation differences due to all the subtle complexity.

    Apache, Nginx, X11, RPM, SSHD, Terraform, and other programs have configuration files designed by humans for humans. They make it easy to accomplish tasks specific to those programs. You wouldn't use an INI file to configure Apache, and you wouldn't use an Apache config to build an RPM package. Terraform may need a ton of custom logic and functions, but X11 doesn't (Terraform actually has 2 configuration formats and a data serialization format, and Packer HCL is different than Terraform HCL). Config formats minimize footguns by being intuitive, matching application use case, and avoiding problematic syntax (if designed well). And you'd never use any of them to serialize data. Their design makes the programs more or less complex; they can avoid complexity by supporting totally random syntax for one weird edge case. Design decisions are just as important in keeping complexity down as in keeping good UX.

    Somebody could take an inventory of every configuration format in existence, matrix their properties, come up with a couple of categories of config files, and then plop down 3 or 4 standards. My guess is there are multiple levels of configuration complexity (INI -> "Unixy" (sudoers, logrotate) -> Apache -> HCL) depending on the app's uses. But that's a lot of work, and I'm not volunteering...

  • I quite like CUELang (https://cuelang.org/), although it is not yet widely supported.

    It strikes a good balance between expressivity and readability: it has enough logic to be useful, but not so much that it invites abuse. It can import from and export to YAML and JSON, and it features an elegant type system that lets you define both the schema and the data itself.

    I hope it gains traction.

  • TOML is pretty much the best one I have seen so far, at least for small to medium-sized config files.

  • We do, it’s called TOML. The future is here, it’s just not equally distributed.

    • TOML sucks for lists of tables, simply because inline tables were intentionally crippled so that they can only occupy one line, for ideological reasons ("we don't need to add a pseudo-JSON"). Unless your table is small, it's going to look absolutely terrible crammed into one line.

      https://github.com/toml-lang/toml/issues/516

      The official way to do a list of tables is this (look at how much duplication there is):

        [[main_app.general_settings.logging.handlers]]
          name = "default"
          output = "stdout"
          level = "info"
      
        [[main_app.general_settings.logging.handlers]]
          name = "stderr"
          output = "stderr"
          level = "error"
      
        [[main_app.general_settings.logging.handlers]]
          name = "access"
          output = "/var/log/access.log"
          level = "info"
      

      vs. what multi-line inline tables would allow (not valid in TOML 1.0):

        handlers = [
          {
            name = "default",
            output = "stdout",
            level = "info",
          }, {
            name = "stderr",
            output = "stderr",
            level = "error",
          }, {
            name = "access",
            output = "/var/log/access.log",
            level = "info",
          },
        ]
      

      I would still reach for TOML first if I only needed simple key-value configuration (never YAML), but for anything requiring list-of-tables I would seriously consider JSON with trailing commas instead.
