
Comment by BenFrantzDale

4 years ago

All of the C++ standard-library algorithms list complexity guarantees, I believe. It stunned me to learn from this saga that C doesn't seem to do the same.

It's easy to forget that the original C standards were largely codifying existing practice, in an era when using gets() [1] was existing practice. The world wasn't quite ready for Ada, I guess. The best-laid plans of mice and men, etc.

Also, keep an eye out for "amortized" complexity. This does have a legitimately rigorous definition, but for latency-bound paths it can practically amount to "O(whatever), except for the particular invocations that are far, far worse under unspecified conditions".

[1] https://www.man7.org/linux/man-pages/man3/gets.3.html#BUGS
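
To see that "amortized" fine print concretely, here's a minimal sketch (my own, with an arbitrary 100 µs reporting threshold) that times individual std::vector::push_back calls. Almost all of them take a few nanoseconds, but the ones that trigger a reallocation can be orders of magnitude slower, which is exactly the latency spike that "amortized O(1)" hides:

    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<int> v;
        for (int i = 0; i < 10000000; ++i) {
            auto t0 = std::chrono::steady_clock::now();
            v.push_back(i);  // amortized O(1), but a reallocation is O(n)
            auto t1 = std::chrono::steady_clock::now();
            long long ns = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
            if (ns > 100000)  // arbitrary threshold: report only the outliers
                std::printf("push_back #%d took %lld ns\n", i, ns);
        }
    }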

  • It's also easy to forget that C was competing mainly with assembly, while C++ competed with managed languages. The early C programmer ethos, especially among library authors, was much more along the lines of "look at the generated object code if you want to know what it's doing" while modern practice leans more towards "read the documentation for complexity guarantees". I'm not saying that worse documentation leads to better programmers, but I'm not not saying that either. Practices change, standards change.

• Good documentation and inspecting the compiled output are both good ways of finding out about the performance characteristics of a given feature. The problem starts when people rely on assumptions ("sscanf should be fast because it's widely used") or performance folklore ("making a local alias for every function you'll ever call makes your Lua code faster"), because those tend to either be completely wrong or to lack very important context.

      3 replies →

• It makes me chuckle when hash map insertion is stated to be O(1). That's true with respect to the number of items in the map, assuming the map doesn't need resizing and there isn't a hash collision... but it's generally not true with respect to the key length. (I think most implementations are O(l·n) in the worst case, where l is the length of the key and n is the number of inserted items, assuming the hash function is O(l); the _amortised_ runtime would be O(l).)
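
    A standalone sketch of that key-length term (mine, assuming std::hash<std::string> reads every byte of the key, which common implementations do): time the hash on keys of increasing length and the cost grows roughly linearly with l, so even a "free" insert pays O(l) before it ever touches a bucket.

        #include <chrono>
        #include <cstdio>
        #include <string>

        int main() {
            for (size_t len = 1024; len <= 1048576; len *= 4) {
                std::string key(len, 'x');
                auto t0 = std::chrono::steady_clock::now();
                size_t h = std::hash<std::string>{}(key);  // O(l) walk over the key
                auto t1 = std::chrono::steady_clock::now();
                long long ns = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
                std::printf("len=%8zu  hash: %lld ns  (h=%zu)\n", len, ns, h);
            }
        }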

    • > assuming the map doesn't need resizing

This isn't a big difficulty: with geometric growth, the total rehashing work across n insertions is n + n/2 + n/4 + ... < 2n, so it's still amortized O(1).

      > and there isn't a hash collision

      This is a real difficulty, unless you allow map resizing. Luckily, we do.

      > but it's generally not true in respect to the key length.

      OK, but in most cases the key length is constant, making anything that depends on the key length O(1) by definition.

      25 replies →

• But isn't the key length a constant, putting us back at O(1)? OK, in theory you could exhaust all possible keys of a certain length and proceed to longer keys. Since distinguishing n keys requires a key length of at least log(n), what would that give us? O(log n)?

      1 reply →

  • ...I fully plan to use "O(whatever)". Not sure for what.

But, yes. (Naive) quicksort's average-case complexity being O(n log n) while it's O(n^2) on already-sorted data is all I ever needed to take away that lesson. When sorting already-sorted data is slower than sorting randomized data, you quickly realize that "amortized cost" means "read the fine print".
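
    A minimal sketch of that worst case (my code, pivoting on the first element; real library sorts avoid this with better pivot selection): count comparisons on shuffled versus already-sorted input.

        #include <algorithm>
        #include <cstdio>
        #include <numeric>
        #include <random>
        #include <vector>

        static long long comparisons = 0;

        // Naive Lomuto quicksort, always pivoting on the first element. On sorted
        // input every partition is maximally lopsided, so it degrades to O(n^2).
        void quicksort(std::vector<int>& a, int lo, int hi) {
            if (lo >= hi) return;
            int pivot = a[lo], i = lo;
            for (int j = lo + 1; j <= hi; ++j) {
                ++comparisons;
                if (a[j] < pivot) std::swap(a[++i], a[j]);
            }
            std::swap(a[lo], a[i]);
            quicksort(a, lo, i - 1);
            quicksort(a, i + 1, hi);
        }

        int main() {
            const int n = 10000;  // small enough that the O(n) recursion depth fits the stack
            std::vector<int> sorted(n);
            std::iota(sorted.begin(), sorted.end(), 0);
            std::vector<int> shuffled = sorted;
            std::shuffle(shuffled.begin(), shuffled.end(), std::mt19937{42});

            comparisons = 0; quicksort(shuffled, 0, n - 1);
            std::printf("shuffled: %lld comparisons\n", comparisons);  // roughly n log n

            comparisons = 0; quicksort(sorted, 0, n - 1);
            std::printf("sorted:   %lld comparisons\n", comparisons);  // n(n-1)/2
        }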

  • Are there complexity guarantees for std::istream::operator>>?

  • That's because C is just a wrapper for machine code on a cheap PDP-11, and cheap PDP-11s didn't have enough RAM to do complexity.