Comment by thomasmg

6 years ago

Very interesting! If both construction and lookup need to be very fast, and you have enough memory, then I would say a blocked Bloom filter would be best (without knowing your use case in detail). With a regular Bloom filter you will get lots of CPU cache misses, because each lookup probes several unrelated memory locations. Cuckoo and xor filters might make sense if you can do (batch) construction in background threads, building multiple filters concurrently. That way you can save memory.
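
To make the blocked Bloom filter idea concrete, here is a rough sketch in Python. It is not taken from any particular library: the class name, the 512-bit block size (one 64-byte cache line), the probe count, and the use of BLAKE2b as the hash are all assumptions chosen for illustration. The point is only that every key touches exactly one block, so a lookup costs at most one cache miss. Python integers are not cache-line aligned, so this shows the layout, not the performance.

```python
import hashlib


class BlockedBloomFilter:
    """Illustrative blocked Bloom filter: all probe bits of a key live in one block."""

    BLOCK_BITS = 512  # assumed block size: one 64-byte cache line

    def __init__(self, num_blocks, num_probes=8):
        self.num_blocks = num_blocks
        self.num_probes = num_probes
        # One Python int per block; a real implementation would use aligned memory.
        self.blocks = [0] * num_blocks

    def _block_and_mask(self, key: bytes):
        # Assumed hash: 128 bits of BLAKE2b, split into two 64-bit halves.
        digest = hashlib.blake2b(key, digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1  # odd step gives distinct bits
        block = h1 % self.num_blocks
        mask = 0
        for i in range(self.num_probes):
            # Double hashing, restricted to bit positions inside the chosen block.
            bit = ((h1 >> 32) + i * h2) % self.BLOCK_BITS
            mask |= 1 << bit
        return block, mask

    def add(self, key: bytes) -> None:
        block, mask = self._block_and_mask(key)
        self.blocks[block] |= mask

    def might_contain(self, key: bytes) -> bool:
        block, mask = self._block_and_mask(key)
        # True means "possibly present"; False means "definitely absent".
        return (self.blocks[block] & mask) == mask
```

Construction is a single pass of `add` calls with no retries or global rebuild step, which is why it suits a high sustained insert rate better than cuckoo or xor filters. The trade-off is a somewhat higher false-positive rate than a regular Bloom filter of the same size.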

The filters are used to seek in database files, so lookup/query-side performance is not a major issue (xor filters were mainly interesting because of their smaller size). Normally a secondary index would do the job fine, but because of the high sustained rate there is no time to write it. And all CPU cores are already maxed out, so moving more work to the background doesn't provide a net benefit.

Thanks everybody for your encouraging words :)