Comment by duped
2 years ago
A lot of cache oblivious structures start with a BST laid out in a flat array, which is actually more cache friendly than a naive B tree. But not many people do that in practice, and also require keeping it balanced, so B trees are fine.
BSTs are also useful as an analytical tool because they're just B trees with B = 2, so a lot of the insight into their structure and algorithms can be extrapolated to N-ary search trees aka B trees
by 'bst laid out in a flat array' i am guessing you mean a layout like a traditional binheap, so, for the keys 1 2 3 4 5 6 7
is that right, and do you mean to make this arbitrarily large
i feel like this is less cache friendly than a naive b-tree, even without rebalancing; if your cache line size is 128 bytes, your keys are 4 bytes, and your b-tree nodes are 15 keys and 16 (4-byte!) pointers, you can reach any of 1048576 keys in 5 cache-line fills, of which probably 2 were already in your cache so it's really 3. by contrast, the flat-array binary search tree above is 20 levels deep. the first 5 levels are in one cache line (because you don't waste any space on pointers) but every key in the following 15 levels of the tree is in its own cache line so you have probably 14 cache-line fills
14 is like a lot more than 3 in my mind maybe
conceivably you have done this in practice and can tell me what i'm overlooking
I believe yes, as illustrated in [1]. This idea only works in a static sense, because to insert a value suffers from the same linear movement of memory as dynamic arrays. B-trees are somewhere in-between because they support logarithmic insert/split/join and make better use of cache than BSTs, a well known fact.
I tried to focus specifically on BSTs in the context of online persistence and concurrency, where they are particularly effective at very large sizes. There is a section on the paper that mentions B-trees but I did not want to include a direct comparison as part of the scope. Take the best candidate from this set, and compare against a good B-tree implementation in whatever situation you might require them.
[1] https://algorithmica.org/en/eytzinger