← Back to context

Comment by amluto

6 months ago

I find the unordered_map example rather amusing. C++’s unordered_map is, somewhat infamously, specified in an unwise way. One basically cannot implement it with a modern, high performance hash table for at least two reasons:

1. unordered_map requires some bizarre and not widely useful abilities that mostly preclude hash tables with probing:

https://stackoverflow.com/questions/21518704/how-does-c-stl-...

2. unordered_map has fairly strict iteration and pointer invalidation rules that are largely incompatible with the implementations that turn out to be the fastest. See:

> References and pointers to either key or data stored in the container are only invalidated by erasing that element, even when the corresponding iterator is invalidated.

https://en.cppreference.com/w/cpp/container/unordered_map

And, of course, this is C++, where (despite the best efforts of the “profiles” people), the only way to deal with lifetimes of things in containers is to write the rules in the standards and hope people notice. Rust, in contrast, encodes the rules in the type signatures of the methods, and misuse is deterministically caught by the compiler.

Like std::vector, std::unordered_map also doesn't do a good job on reservation, I've never been entirely sure what to make of that - did they not care? Or is there some subtle reason why what they're doing made sense on the 1980s computers where this was conceived?

For std::vector it apparently just didn't occur to C++ people to provide the correct API, Bjarne Stroustrup claims the only reason to use a reservation API is to prevent reference and iterator invalidation. -shrug-

[std::unordered_map was standardised this century, but, the thing standardised isn't something you'd design this century, it's the data structure you'd have been shown in an undergraduate Data Structures class 40 years ago.]

  • > For std::vector it apparently just didn't occur to C++ people to provide the correct API, Bjarne Stroustrup claims the only reason to use a reservation API is to prevent reference and iterator invalidation. -shrug-

    Do you mean something like vector::reserve_at_least()? I suppose that, if you don’t care about performance, you might not need it.

    FWIW, I find myself mostly using reserve in cases where I know what I intend to append and when I will be done appending to that vector forever afterwards.

    • I'm not familiar with vector::reserve_at_least but assuming that's an API which reserves capacity without destroying the amortized constant time of the exponential growth built in to the type, yes, that.

      1 reply →