Binary fuse filters: Fast and smaller than xor filters (2022)

5 days ago (arxiv.org)

I recommend the zig library [1], it was a joy to use. Bloom filters was one of the first interesting algorithms I did in class back in university, we upgraded hardware during the lab making the use of bloom filters unnecessary in a lab ment to interactively show its usefulness. I have had this repeated since then, these filters are magic until hardware catches up, having smaller filter is lovely.

[1] https://github.com/hexops/fastfilter

Fast, but not faster than XOR filters. I was wondering if the title was a typo, but the article clarifies they sacrificed some speed for the smaller size.

  • One construction is smaller and faster to build than xor filters, and another is even more compact, though slower:

    > we build probabilistic filters -- called binary fuse filters -- that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.

  • I think you misread:

    > that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.

    They are faster to construct (2x) and smaller (within 13% of theoretical limit) while maintaining the same query performance.