← Back to context

Comment by londons_explore

3 days ago

What are they running this code on?

I doubt their hardware is any faster shuffling bits in a uint32 than a uint64, and using uint64 should have a decent benefit to the false positive rate...

It was just an oversight. Thinking about it now, u64 could have been better:

1. intra element collison goes down: 1/32 = 3.1% vs 1/64 = 1.6% -> 1.5% difference. Intra element collison doesn't mean guaranteed a FP though! 2. 64 bucket would have less variance - filter behavior would be more predictable

But there is a downside: When switching from 32 to 64 bit word we also reduce number of array elements 2 times, doubling the contention. We are populating the filter from up to 128 threads during join phase. When the build side isn't significantly smaller than the probe side, that contention can overweight the FP improvement.

That struck me as an odd choice, too. On average there's no difference in false positives, but the smaller the blocks, the more likely they'll be saturated. Since there are 6 leftover bits in the hash anyway, there's no cost to increase the two 5-bit values to 6 bits and the block size to 64. You'll have a lot fewer hot blocks that way.

With blocks this small there's also no reason not to optimize the number of hash functions (albeit this brings back the specter of saturation). There are no cache misses to worry about; all positions can be checked with a single mask.