← Back to context

Comment by ztorkelson

8 years ago

Would you mind elaborating? I'm not quite sure what you mean by segmented here.

I'm not thecompilr, but if I understand them correctly:

In the worst case, the cache behaviour of a radix tree could be awful, if each node lives in a different cache-line, no?

I imagine this problem can be ameliorated with a custom allocator, though.

Been a while since I've studied hash-maps in depth, but with a good hash function, that wouldn't be a big worry, would it?

  • Gotcha. Yes, while there are implementation techniques that can help (e.g. arena allocation, prefix compression, adaptive node sizes), the data structure is fundamentally a tree, and so one can expect a higher number of random memory accesses (when compared with suitably optimized hash tables, like those discussed in the article).

    In some scenarios, radix trees can overcome that with better space efficiency (i.e. implicit key storage), but I think it's fair to say that there are many scenarios (if not most!) where hash tables outperform radix trees.

    Of course, that all hinges on how you measure performance. As Salvatore pointed out, if you're looking to minimize variance, hash tables might not be suitable due to their growth characteristics. Radix trees also give you efficient sorting and range scans out of the box, and are a viable immutable persistent data structure as well. If none of those are meaningful to your use cases, then hash tables could certainly be a better fit.

    • Arena-allocation? As in (roughly speaking) a Stack<Node> data structure whose lifetime is bound to the lifetime of the tree, right?

      How could we support deletion of nodes?

      5 replies →

  • Thank you. Exactly. Even worse each node could live in a different page. Hash maps can span multiple pages, but the lookup should succeed in 1 or 2 tries for a good hash function.