Comment by the8472

6 years ago

One of the advantages of bloom filters is that you can also do intersections, unions and cardinality estimations thereof on remote sets. These don't seem to support that.

Yes, it is possible to do this, but do people actually use those features? For (pure) cardinality estimation, there is HyperLogLog, which can also be merged...

  • > do people actually use those features?

    I did, though I'm not sure how common it is. For much-larger-than-memory problems, being able to do an estimate or logical operation(s) before having to hit the disk can give a tremendous speedup.

    • I used that to sum unique visitors on websites, by merging the daily bloom filters, each of which represents the set of visitors for one day. It was pretty cool to be able to sum "unique" metrics, which usually isn't possible in classic reporting tools.
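
For anyone wondering what merging daily filters looks like in practice, here is a minimal sketch in Python. It is not the code anyone in this thread used. The `BloomFilter` class, the sizes `m` and `k`, the BLAKE2b-based double hashing, and the `user…` IDs are all assumptions chosen for illustration. The union is a bitwise OR of two filters built with the same parameters, the cardinality estimate is the usual fill-ratio formula n ≈ -(m/k) · ln(1 - X/m), where X is the number of set bits, and the intersection size is estimated by inclusion-exclusion.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter; the bit array is stored in one Python int."""

    def __init__(self, m=1 << 16, k=4, bits=0):
        self.m = m        # number of bits (hypothetical choice)
        self.k = k        # number of hash positions per item
        self.bits = bits  # bitset as an arbitrary-precision int

    def _positions(self, item):
        # Double hashing: derive k positions from two 64-bit halves of one digest.
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:], "big") | 1  # force the stride to be odd
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def union(self, other):
        # Only meaningful when both filters share m, k and the hash scheme.
        if (self.m, self.k) != (other.m, other.k):
            raise ValueError("filters must use the same parameters")
        return BloomFilter(self.m, self.k, self.bits | other.bits)

    def estimate_cardinality(self):
        set_bits = bin(self.bits).count("1")
        if set_bits >= self.m:
            return float("inf")  # filter is saturated, no estimate possible
        return -(self.m / self.k) * math.log(1 - set_bits / self.m)


def estimate_intersection(a, b):
    # Inclusion-exclusion on the three estimates: |A ∩ B| ≈ |A| + |B| - |A ∪ B|.
    return (a.estimate_cardinality() + b.estimate_cardinality()
            - a.union(b).estimate_cardinality())


# Two "days" of made-up visitor IDs, overlapping in 500 visitors.
day1 = BloomFilter()
day2 = BloomFilter()
for i in range(1000):
    day1.add(f"user{i}")
for i in range(500, 1500):
    day2.add(f"user{i}")

both = day1.union(day2)
print("estimated uniques over both days:", round(both.estimate_cardinality()))
print("estimated returning visitors:", round(estimate_intersection(day1, day2)))
```

The estimates are approximate and get worse as the filters fill up, so `m` has to be sized for the largest union you expect to compute, not just for a single day.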