I recommend the zig library [1], it was a joy to use. Bloom filters was one of the first interesting algorithms I did in class back in university, we upgraded hardware during the lab making the use of bloom filters unnecessary in a lab ment to interactively show its usefulness. I have had this repeated since then, these filters are magic until hardware catches up, having smaller filter is lovely.
Fast, but not faster than XOR filters. I was wondering if the title was a typo, but the article clarifies they sacrificed some speed for the smaller size.
One construction is smaller and faster to build than xor filters, and another is even more compact, though slower:
> we build probabilistic filters -- called binary fuse filters -- that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.
> that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.
They are faster to construct (2x) and smaller (within 13% of theoretical limit) while maintaining the same query performance.
Related prior work:
Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters - https://news.ycombinator.com/item?id=21840821 - Dec 2019 (81 comments)
I recommend the zig library [1], it was a joy to use. Bloom filters was one of the first interesting algorithms I did in class back in university, we upgraded hardware during the lab making the use of bloom filters unnecessary in a lab ment to interactively show its usefulness. I have had this repeated since then, these filters are magic until hardware catches up, having smaller filter is lovely.
[1] https://github.com/hexops/fastfilter
Fast, but not faster than XOR filters. I was wondering if the title was a typo, but the article clarifies they sacrificed some speed for the smaller size.
One construction is smaller and faster to build than xor filters, and another is even more compact, though slower:
> we build probabilistic filters -- called binary fuse filters -- that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.
I think you misread:
> that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.
They are faster to construct (2x) and smaller (within 13% of theoretical limit) while maintaining the same query performance.
This feels like a better xor filter implementation.
Same author.