← Back to context

Comment by colinchartier

6 years ago

Direct link to paper (pdf) - https://arxiv.org/pdf/1912.08258.pdf

I haven't fully read the paper yet, but it seems like a variation of a bloom filter where membership is determined by xoring three hashes (see algorithm 1 in the pdf)

If that's the case then both membership and non-membership will be probabilistic, rather than one or the other being certain like boom filters, right?

  • "Membership test: returns true if the key x is likely in S, false otherwise"

    I think they circumvent the problem by making a complicated insertion procedure where all of the keys need to be inserted at once, it seems uses a stack to avoid hash collisions

    Take a look at https://github.com/FastFilter/xorfilter/blob/master/xorfilte...

    I can usually read go, but I've never been that great at deciphering "academic code" so the exact details escape me here.

    • Yeah, TFA says it's intended as an immutable structure that has to be rebuilt to add new data to it - which limits its applications.

    • Yep, also the insertion procedure is retried with increasingly larger arrays until no “collisions” are encountered.

      Reminds me of minimal perfect hash construction.

      1 reply →