Comment by dang
8 hours ago
Related prior work:
Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters - https://news.ycombinator.com/item?id=21840821 - Dec 2019 (81 comments)
8 hours ago
Related prior work:
Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters - https://news.ycombinator.com/item?id=21840821 - Dec 2019 (81 comments)
See also the paper Ribbon filter: practically smaller than Bloom and Xor: https://arxiv.org/abs/2103.02515, which is a similar idea though not by the same authors.
IIRC, binary fuse filters are faster to construct than ribbon filters, but typically not quite as space-efficient. There are also frayed ribbon filters (by me) which are slower and more complex to construct but more space-efficient. There's no paper for those, just a Rust implementation.
Ribbon filters are deployed in Mozilla's Clubcard for distributing compressed certificate revocation lists: https://github.com/mozilla/clubcard and https://jmschanck.info/papers/20250327-clubcard.pdf. CRLs are an almost ideal application of this sort of compressed set tech, since the aggregator runs batch jobs and needs to distribute the set to very many clients. It's not perfectly ideal because CRLs require frequent updates and none of these methods support delta updates. There is a straightforward but inelegant workaround, which is to send a compressed set that represents the delta, and query both on the client.