← Back to context

Comment by procaryote

2 days ago

Caveat that this is new to me and I might get it wrong

It seems to be a way to arrange a hashtable so you use most of the hashed value (H1) to identify a range of cells in your table, and the rest of the hashed value (H2) as metadata. In the metadata for a cell it's tagged as empty/full/deleted and a full cell is also tagged with its H2 value

So when looking up things you find the range with H1, then you check H2 against the metadata for that range, to see which of the cells might contain the right value.

I imagine the benefit is that checking H2 against warm metadata is cache friendly and can use vector instructions, whereas with regular double hashing you'd need to lookup cold memory to check the key, and then perhaps do it again a couple of times

Do correct me if I am wrong