Comment by delta_p_delta_x

2 days ago

> anything serious, i.e. high throughput, low frequency, massively concurrent work

Why is only 'high throughput, low frequency, massively concurrent work' considered 'serious'?

16 comments

delta_p_delta_x

They're just clearly inferior in pretty much any situation.

The map stuff the other posters summed up well but even std::vector is dogshit with pretty much all implementations having inlined grow code in push_back, a not too great API and missed optimisations e.g. no trivial relocation when growing the vector / moving it and no useful APIs such as "grow but don't initialise"...

jstimpfle 2 days ago
To be fair grow-but-don't-initialize is a pretty fundamental part of the API, the reserve() method.
But already the basic premise that you should push back without thinking is wrong. You will suffer reallocations and invalisations when you least expected them, and frankly you have to architect around that fact which is a terrible restriction. You can work around by pre reserving but at that point it's just a basic fixed heap allocated array but worse because the type gives you a weird look all the time, "I'll realloc as soon as you don't pay attention, harhar"!
- tialaramex 2 days ago
  
  std::vector lacks what I call the "bifurcated reservation API". It has Rust's Vec::reserve_exact but not Vec::reserve -- these APIs serve subtly different purposes, the former (which C++ calls just "reserve") says "Here's a hint for exactly how big this container will ever grow" while the latter says "Here's a hint for how big this container will get in my immediate future, but it might grow further later".
  The implementation always tries to grow (if necessary) to the exact size chosen for Vec::reserve_exact, but for plain Vec::reserve if growth is needed it always grows exponentially, not to the exact size, preserving the O(1) push cost.
  For a typical "doubling" growable array type, if we're pushing groups of ten items, reserve_exact or C++ grows like 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 ... which is much worse than O(1) whereas the correct reserve grows 10, 20, 40, 80, 160 preserving O(1)
  For trivial types you can work around this in C++ with a little work, and for the non-trivial types you can work around it with a bunch more extra code, but you probably won't.
  Bjarne among other people teaching C++ recommend just not using the reservation API as a reservation API because of this problem†, and the resulting teaching definitely leaks into CS graduates and even into languages which have the correct API and so you have to un-teach the bad lesson.
  In applications where you actually can't afford to pay for growth (or at least in some cases can't afford this) I also like Vec::push_within_capacity which I believe comes from Rust for Linux where the kernel legitimately needs this "If there's room push, otherwise I have a plan B" approach.
  † To Bjarne this API is instead conceived of as a way to preserve reference validity. Since it won't grow, our references will still work.
  
  9 replies →

jstimpfle 2 days ago

You are free to make your own definition, what are your suggestions?

(Obviously I meant to say low latency, not low frequency)

delta_p_delta_x 1 day ago
That is a cop-out. You made the assertion, you define it.
What about these particular workloads (and the environments they're used in) make them 'serious' and why are other workloads 'lesser' and therefore the standard library 'suffices'? Why not use better containers for everything? Google, for instance, universally recommends Abseil.
- jstimpfle 1 day ago
  
  I stand by my statement. You are welcome to contend it, but then please actually suggest alternative workloads that could qualify as serious?
  Google also recommends using Golang in many cases, which was explicitly designed for not-so-experienced people. It is more geared towards creating services quickly and robustly, not towards squeezing out the last 10x of performance.
  I can't say anything about abseil, having never used it. "By Google" is not an immediate seal of quality, and it doesn't mean that one size fits all. I've recently seen a list of .so objects required in order link grpc (also by Google), there were like dozens of abseil .so's in the list IIRC. I don't like it! In my experience, validated countless times in my own practice, complexity => slow and hard to maintain and simplicity => fast and easy to change.
  Templatized C++ containers are generic recipes, they can't make use of local context and hence don't lead to the simplest solutions. They produce tons of boilerplate. They give you a decent single-threaded baseline performance in many cases, sure, but now start hammering these data structures using 16 CPUs in parallel... you might find that you should have designed your application completely differently.
  Like WalterBright says, I too use very simple data structures almost exclusively (arrays, linked lists, linear allocation). Most of these coded ad-hoc everywhere, very straightforward, very adaptable. Even hash maps I only use infrequently -- if I can, I just make keys such that I can index directly. I get performance from knowing exactly what happens, and minimize the work that needs to happen. I create immutable data objects (write-once) where possible. I've created my own pooled block-allocator for power of 2 sized allocations with book-keeping in shadow-memory (preventing fragmentation), massively reducing syscalls that modify virtual memory mappings and closely controlling memory used by various subsystems.
  I don't expect a library like abseil to solve many of my problems, it probably makes some things "easy" by taking away the very control from me that I need to solve the problem I have, resulting in unsolved problems and complexity.

miroljub 2 days ago

Because if you don't need any of these, any slop implementation will do.