This is a reference to YAML parsing the two-letter ISO country code for Norway, NO, as equivalent to a boolean false value. It is a relatively common source of problems. One solution is to quote the value ("NO") so that it is read as a string.
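The behavior can be approximated in a few lines of plain Python. This is a minimal sketch, not any library's actual code: the word list is the boolean set from the YAML 1.1 type definitions, leaving out the single-letter y/n forms, and `resolve_scalar` is a hypothetical name used only for illustration.

```python
import re

# Boolean spellings recognized for unquoted scalars under YAML 1.1
# (single-letter y/n forms are left out of this sketch).
TRUE_WORDS = re.compile(r"^(?:yes|Yes|YES|true|True|TRUE|on|On|ON)$")
FALSE_WORDS = re.compile(r"^(?:no|No|NO|false|False|FALSE|off|Off|OFF)$")


def resolve_scalar(token: str):
    """Resolve one scalar token roughly as a YAML 1.1 style loader would (hypothetical helper)."""
    # A quoted token is always a string; quoting is the escape hatch.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return token[1:-1]
    if TRUE_WORDS.match(token):
        return True
    if FALSE_WORDS.match(token):
        return False
    return token  # anything else stays a plain string in this sketch


countries = ["SE", "NO", "DK", '"NO"']
print([resolve_scalar(c) for c in countries])
# → ['SE', False, 'DK', 'NO']
```

Only the unquoted NO changes type, which is why the bug tends to show up for exactly one entry in a list of country codes.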
More context: https://www.bram.us/2022/01/11/yaml-the-norway-problem/
I think it would be better to require quotation marks around all string values, in order to avoid this kind of problem. (It is not the only problem with YAML, but in my opinion any format with multiple types should require a string to be marked explicitly as a string, and YAML (like some other formats) doesn't.) (If keys are required to be strings, then it can be reasonable to allow unquoted keys, provided the set of characters an unquoted key can contain is restricted and unquoted empty strings are disallowed as keys.)
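As a rough illustration of that rule, here is a sketch in plain Python of a scalar parser for a hypothetical strict format (not YAML, and not any existing library): strings must be quoted, only a few bare literals are allowed, and anything else is rejected rather than guessed. The names `parse_strict_scalar` and `is_valid_bare_key`, the literal set, and the choice of key characters are all assumptions made for the example.

```python
import re

BARE_KEY = re.compile(r"^[A-Za-z_][A-Za-z0-9_-]*$")  # assumed key alphabet; never empty
INTEGER = re.compile(r"^-?(?:0|[1-9][0-9]*)$")
LITERALS = {"true": True, "false": False, "null": None}


def is_valid_bare_key(key: str) -> bool:
    """Unquoted keys are allowed only from a restricted character set."""
    return BARE_KEY.match(key) is not None


def parse_strict_scalar(token: str):
    """Parse a value token; every string must be explicitly quoted."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]  # no escape handling in this sketch
    if token in LITERALS:
        return LITERALS[token]
    if INTEGER.match(token):
        return int(token)
    raise ValueError(f"unquoted value {token!r} is not a known literal; quote it if it is a string")


print(parse_strict_scalar('"NO"'))   # → NO
print(parse_strict_scalar("false"))  # → False
print(is_valid_bare_key("country"), is_valid_bare_key(""))  # → True False
try:
    parse_strict_scalar("NO")
except ValueError as err:
    print("rejected:", err)
```

The point of the design is that an unquoted NO becomes a loud error at load time instead of a silent type change.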
We stopped having this problem over ten years ago when spec 1.2 was implemented. Why are people still harping on about it?
Current PyYAML still loads an unquoted NO as False.
Other people did not stop having this problem.
It might be that there’s some setting that fixes this or some better library that everyone should be switching to, but YAML has nothing that I want and has been a repeated source of footguns, so I haven’t found it worth looking into. (I am vaguely aware that different tools do configure YAML parsing with different defaults, which is actually worse. It’s another layer of complexity on an already unnecessarily complex base language.)
A new spec version doesn’t mean we stop having the problem.
E.g. Kubernetes wrote about solving this only five months ago[1], by moving from YAML to KYAML, a YAML subset.
[1]: https://kubernetes.io/blog/2025/07/28/kubernetes-v1-34-sneak...
Because there's a metric ton of software out there that was built once upon a time and then that bit was never updated. I've seen this issue out in the wild across more industries than I can count.
Because once a technology develops a reputation for having a problem it's practically impossible to rehabilitate it.
Now add brackets and end-tags, I'll reconsider. ;)