Comment by jstimpfle
1 day ago
There are many many situations where I know exactly what is the common path and what is the uncommon path. And when I don't know, it's much harder to optimize!
If you're counting pennies, Vec::reserve() (inexact) is hardly what you want, because in the worst case you're wasting a factor of 1.5x or 2x of elements due up-front overallocation. Maybe chunk lists could be better, overhead is bounded by chunk size and all operations are constant-time. No pointer invalidation either. And you can pool those chunks, preventing memory fragmentation and improving memory utilization, since there aren't a million different sized allocations in your process.
> If you're counting pennies, Vec::reserve() (inexact) is hardly what you want, because in the worst case you're wasting a factor of 1.5x or 2x of elements due up-front overallocation.
I think this is "Trump math" but assuming we're actually looking at the same over-allocation this isn't new, what you're calling "wasting a factor of 2x" was already what you were paying by using the amortized O(1) growable array type, that's its central trade off, so yeah, the type with that property does still have that property.
> Maybe chunk lists could be better, overhead is bounded by chunk size and all operations are constant-time.
Basically std::deque. Surely no more need be said ?
> was already what you were paying by using the amortized O(1) growable array type
As I said, I don't think growable array data structure is a good idea.
> Basically std::deque. Surely no more need be said ?
Ok now you're really out of your depth here. Pretty sure you haven't actually used it, it's a somewhat obscure data structure even though the more serious C++ programmers have heard of it, most have probably never used it.
From a web search, std::deque seems to be pretty universally thought of as a weird and very bad data structure. I've had to learn myself because I did actually try to use it myself recently. Beyond being unintuitive to use due to being "abstract" (had 2-3 serious bugs, e.g. unexpected iterator invalidations that happened only under load), apparently std::deque is not even specified as a chunk list. O(1) random access must be guaranteed. So it must be much more complicated than that, I don't know the details though.
And while actual implementations do use chunks in some way (just not in a simple chunk list), apparently "chunks" is not a part of the std::deque spec, and hence there isn't anything standardized about the size of these chunks either.
- On MSVC, chunk size is 16 elements so chunk size of std::deque char is 16 BYTES !!!!! I was thinking they can't be serious, but apprently it is the case
- On GCC, chunk size is 512 byte fixed.
- On Clang, chunk size is 4096 byte fixed.
It _is_ a shithole, surely no more need be said?
> As I said, I don't think growable array data structure is a good idea.
It is still quite hard for me to take this seriously as a belief.
> Ok now you're really out of your depth here.
I don't think so. As you found, the consensus is that this sort of thing is a terrible idea. The O(1) complexity looks good superficially but your actual performance is miserable.
> I don't know the details though.
Raymond Chen has a pretty good explanation of the three implementations of std::deque, like he did for their std::string but Raymond's goal is to help you do forensics, he's not here to tell us which data structures are a good idea. https://devblogs.microsoft.com/oldnewthing/20230810-00/?p=10... might work, or, given Microsoft, it might already be dead by the time you click it.
3 replies →