Comment by ajross
3 months ago
> hard to believe RAM considerations was the reason for the “first match” design of sshd’s config
And I repeat: first match involves less code. It's a simpler design. The RAM point was an interesting digression, I literally put it in parentheses!
I don’t think it does require less code. I don’t think it requires more code either. It’s just not a fundamental code change.
The only difference is whether you overwrite earlier values or stop at the first match. Either way you have to write the same parser rules for the config file structure.
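A minimal sketch of that point, assuming a hypothetical flat `key value` format (this is not sshd's real grammar or code, and `parse_lines` and `load` are made-up names): the tokenizing is shared, and the two precedence rules differ only in whether an existing value is kept or overwritten.

```python
def parse_lines(text):
    """Yield (key, value) pairs from a hypothetical 'key value' config format."""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        yield parts[0].lower(), value


def load(text, first_match=True):
    """Build a settings dict under either precedence rule."""
    settings = {}
    for key, value in parse_lines(text):
        if first_match and key in settings:
            continue  # first match: keep the earlier value
        settings[key] = value  # last match: a later value overwrites
    return settings


SAMPLE = """
Port 22
Port 2222   # later duplicate
PermitRootLogin no
"""

first = load(SAMPLE, first_match=True)
last = load(SAMPLE, first_match=False)
```

Here `first["port"]` keeps the earlier value and `last["port"]` takes the later one; everything else in the two code paths is identical.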
OK, but now that's a performance regression. The assumption upthread was that the whole file needed to be parsed into an in-memory representation. If you don't do that, sure, you can implement precedence either way. But parsing all the way to the end for every read is ridiculous. The choice is between "parse all at once", which allows for arbitrary precedence rules but involves more code, and "parse at attribute read time", which involves less code but naturally wants to be a "first match" precedence.
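To make the read-time case concrete, here is a rough sketch assuming the same kind of hypothetical flat `key value` format (`lookup` is an illustrative name, not anything from sshd). It scans the text on every read and reports how many lines it had to visit, so the cost of "first match" versus "last match" under this design is visible.

```python
def lookup(text, wanted, first_match=True):
    """Scan the config at read time; return (value, lines_scanned)."""
    wanted = wanted.lower()
    found = None
    scanned = 0
    for raw in text.splitlines():
        scanned += 1
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        if parts[0].lower() != wanted:
            continue
        found = parts[1].strip() if len(parts) > 1 else ""
        if first_match:
            break  # first match: nothing later can change the answer
        # last match: keep going, a later line may override this one
    return found, scanned


CONFIG = "Port 22\nPermitRootLogin no\nPort 2222\n"

first_result = lookup(CONFIG, "Port")                     # stops on line 1
last_result = lookup(CONFIG, "Port", first_match=False)   # reads all 3 lines
```

With first match the scan can return as soon as the key is seen; with last match every read has to reach the end of the file.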
As someone who’s written multiple parsers, I can tell you that a parser that stops on a matched condition requires a lot more careful thought about how it can be called in a reusable way while still allowing multiple different types of parameters to be stored in that configuration.
For example:
- You might have one function that requires a list of all known hosts (so now your “stop” condition isn’t a match but rather a full set)
- another function that requires matching a specific private key for a specific host (a traditional match by your description)
- a third function that checks if the host IP and/or host name is a previously known host (a match, but no longer only scanning host names, so you now need your conditional to dynamically support different comparables)
- and a fourth function to check what public keys are available to which user accounts (now you’re after a dynamic way to generate complete sets, because neither the input nor the comparison is fixed and you don’t even want the parser to stop on a matched condition)
Because these are different types of data being referenced with different input conditions, you then need either a Turing-complete parser or different types of config file for those different input conditions. The second option means writing a separate parser for each type of config (sshd actually does the latter).
Clearly the latter is neither simpler nor less code any more.
If you’re just after simplicity from a code standpoint then you’d make your config YAML / JSON / TOML or any other known structured format with supporting off-the-shelf libraries. And you’d just parse the entire thing into memory and then programmatically perform your lookups in the application language rather than some config DSL you’ve just had to invent to support all of your use cases.
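As a rough illustration of that approach, here is a sketch using Python's standard `json` module. The schema (a `hosts` map with a `"*"` fallback section) and the `setting` helper are invented for this example and are not how sshd stores anything.

```python
import json

# Hypothetical config; the "hosts" / "*" layout is an assumption for illustration.
RAW = """
{
  "hosts": {
    "example.org": {"port": 2222, "user": "deploy"},
    "*": {"port": 22}
  }
}
"""


def setting(config, host, key, default=None):
    """Look up `key` for `host`, falling back to the "*" section, then `default`."""
    hosts = config.get("hosts", {})
    for section in (hosts.get(host, {}), hosts.get("*", {})):
        if key in section:
            return section[key]
    return default


# Parse the whole file into memory once; lookups are plain dict access after that.
config = json.loads(RAW)

port_specific = setting(config, "example.org", "port")
port_fallback = setting(config, "other.net", "port")
user_missing = setting(config, "other.net", "user")
```

The precedence rule lives in ordinary application code (the order of the tuple in `setting`), so changing it does not touch the parser at all.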
You introduced an n^2 config algorithm but now you're worried about the much smaller performance impact from going to the end of the file instead of sometimes stopping halfway?
And for the record I'm not convinced your way is simpler. The code gets sprinkled with config loading calls instead of just checking a variable, and the vast majority of the parser is the same between versions.