Comment by aidenn0
6 years ago
If that's the case then both membership and non-membership will be probabilistic, rather than one or the other being certain like boom filters, right?
6 years ago
If that's the case then both membership and non-membership will be probabilistic, rather than one or the other being certain like boom filters, right?
"Membership test: returns true if the key x is likely in S, false otherwise"
I think they circumvent the problem by making a complicated insertion procedure where all of the keys need to be inserted at once, it seems uses a stack to avoid hash collisions
Take a look at https://github.com/FastFilter/xorfilter/blob/master/xorfilte...
I can usually read go, but I've never been that great at deciphering "academic code" so the exact details escape me here.
Yeah, TFA says it's intended as an immutable structure that has to be rebuilt to add new data to it - which limits its applications.
Yep, also the insertion procedure is retried with increasingly larger arrays until no “collisions” are encountered.
Reminds me of minimal perfect hash construction.
Yes, the underlying theory is from a minimal perfect hash function called "BDZ": http://cmph.sourceforge.net/bdz.html . Yes, sometimes a retry is needed, but the array is not made larger; it just retries with a different seed.