Comment by Joker_vD

10 months ago

Nope, the actual ssh implementation parses all the config files once, at startup, using buffered file I/O and getline(). That means that on systems with a modern libc, the whole config file (if it's small enough, less than 4 KiB IIRC?) gets read into RAM, and getline() results are then served from that buffer.
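To make the buffering point concrete, here is a minimal sketch of a line-at-a-time reader built on stdio and getline(). It is not ssh's actual code; `count_config_lines` is a hypothetical helper, and treating `#` lines as comments is an assumption for illustration. On a regular file the stream is normally fully buffered, so most getline() calls are served from the libc buffer rather than from a fresh read().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for getline() under strict -std modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: count the lines that are neither blank nor comments.
 * The FILE stream does the buffering; this loop only ever sees whole lines. */
static int count_config_lines(FILE *fp)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int count = 0;

    assert(fp != NULL);
    while (getline(&line, &cap, fp) != -1) {
        /* Skip leading whitespace, then ignore blank lines and comments. */
        const char *p = line + strspn(line, " \t\r\n");
        if (*p == '\0' || *p == '#')
            continue;
        count++;
    }
    free(line);
    return count;
}
```

The caller opens the file with fopen() and passes the stream in; nothing here controls how much of the file libc pulls in per read(), which is the point being made above.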

The scheme you propose is insane, and if it was ever used (can you actually back that up? The disk I/O would kill your performance on anything remotely loaded), it was rightfully abandoned for a much faster and simpler scheme.

2 comments

ajross 10 months ago

> getline() results are served from that buffer.

So... it doesn't parse them once! It just does its own[1] buffering layer and implements... exactly the algorithm I described? Not seeing where you're getting the "Nope" here, except to be focusing on the one historical note about RAM that I put in parentheses.

[1] Somewhat needless given the OS has already done this. It only saves the syscall overhead.

Joker_vD 10 months ago
Sorry, my comment was intended to be a reply to the one of yours that said:
> I/O is done piecewise, a line at a time. The file is never "loaded up". Again you're applying an intuition about how parsers are presented to college students (suck it all into RAM and build a big in-memory representation of the syntax tree) that doesn't match the way actual config file parsers work (read a line and interpret it, then read another).
So, the whole file is usually loaded up (if it's short enough). At this point you might as well parse all of it, instead of re-reading it from the disk over and over and redoing the same work each time; parsed configs (if they are parsed into flags/enums, and not literal strings) usually take about the same memory as a libc FILE structure does overall, or less.
The complexity of the algorithm is about the same: either the early exit is there or it isn't (in fact, the version with the early exit, now that I think of it, has larger cyclomatic complexity, but whatever).
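For illustration, a rough sketch of the "parse it all once into flags/enums" approach. This is not OpenSSH's parser: the struct layout, the `parse_line`/`parse_config` names, the handful of keys handled, and the default values are all assumptions picked to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum log_level { LOG_QUIET, LOG_INFO, LOG_DEBUG };

/* Hypothetical parsed form: flags and enums instead of literal strings. */
struct parsed_config {
    unsigned compression : 1;
    unsigned forward_agent : 1;
    enum log_level log_level;
    int port;
};

/* Parse one "Key value" line into cfg.
 * Returns 1 if the key was recognised, 0 otherwise. */
static int parse_line(struct parsed_config *cfg, const char *line)
{
    char key[32], val[64];

    /* Blank lines, one-word lines and comments are not options. */
    if (sscanf(line, " %31s %63s", key, val) != 2 || key[0] == '#')
        return 0;

    if (strcmp(key, "Compression") == 0)
        cfg->compression = (strcmp(val, "yes") == 0);
    else if (strcmp(key, "ForwardAgent") == 0)
        cfg->forward_agent = (strcmp(val, "yes") == 0);
    else if (strcmp(key, "Port") == 0)
        cfg->port = atoi(val);
    else if (strcmp(key, "LogLevel") == 0)
        cfg->log_level = strcmp(val, "debug") == 0 ? LOG_DEBUG
                       : strcmp(val, "quiet") == 0 ? LOG_QUIET
                       : LOG_INFO;
    else
        return 0;
    return 1;
}

/* Read the whole stream once, with no early exit, and return the result. */
static struct parsed_config parse_config(FILE *fp)
{
    struct parsed_config cfg = { 0, 0, LOG_INFO, 22 };  /* assumed defaults */
    char line[256];

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL)
        parse_line(&cfg, line);
    return cfg;
}
```

After the single pass, every later lookup is a plain field read such as `cfg.port`, with no file I/O and no string comparison.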