Comment by pi_22by7
4 days ago
The key insight about Bloom filters lacking synergy is excellent. The ~7K document crossover point makes sense because inverted indexes amortize dictionary storage across all documents, while Bloom filters must encode it again for every document, so their cost grows linearly with document count.
But doesn’t that depend on the cardinality of the indexes versus the document count? I’ve seen systems with a stupid number of tag values.
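For what it's worth, here is a rough back-of-the-envelope model of that dependence. It is only a sketch under assumptions that are not from the original post: a fixed vocabulary (`vocab_size`), a flat per-entry dictionary cost, a flat per-posting cost, and Bloom filters sized with the standard optimal formula m = -n ln(p) / (ln 2)^2. All function and parameter names are hypothetical.

```python
import math


def bloom_bits_per_doc(terms_per_doc: int, fp_rate: float) -> float:
    """Bits for one optimally sized Bloom filter: m = -n * ln(p) / (ln 2)^2."""
    return -terms_per_doc * math.log(fp_rate) / (math.log(2) ** 2)


def bloom_total_bits(num_docs: float, terms_per_doc: int, fp_rate: float) -> float:
    """One independent filter per document, so cost is linear in num_docs."""
    return num_docs * bloom_bits_per_doc(terms_per_doc, fp_rate)


def inverted_total_bits(num_docs: float, terms_per_doc: int, vocab_size: int,
                        bits_per_dict_entry: float, bits_per_posting: float) -> float:
    """Dictionary is paid once (assumed fixed); postings grow with num_docs."""
    dictionary = vocab_size * bits_per_dict_entry
    postings = num_docs * terms_per_doc * bits_per_posting
    return dictionary + postings


def crossover_docs(terms_per_doc: int, fp_rate: float, vocab_size: int,
                   bits_per_dict_entry: float, bits_per_posting: float):
    """Document count where both structures cost the same in this model.

    Returns None when a posting list entry costs at least as much per
    document as the Bloom filter does, i.e. the inverted index never wins.
    """
    per_doc_bloom = bloom_bits_per_doc(terms_per_doc, fp_rate)
    per_doc_inverted = terms_per_doc * bits_per_posting
    if per_doc_bloom <= per_doc_inverted:
        return None
    fixed_cost = vocab_size * bits_per_dict_entry
    return fixed_cost / (per_doc_bloom - per_doc_inverted)


if __name__ == "__main__":
    # Illustrative inputs only; none of these numbers come from the article.
    for vocab in (10_000, 100_000, 1_000_000):
        n = crossover_docs(terms_per_doc=100, fp_rate=0.01, vocab_size=vocab,
                           bits_per_dict_entry=64, bits_per_posting=8)
        print(f"vocab={vocab:>9,}  crossover at roughly {n:,.0f} docs")
```

In this simplified model the crossover scales linearly with the dictionary size, so a system with a huge number of distinct tag values would push the break-even point well past whatever a small-vocabulary benchmark shows. If the vocabulary itself keeps growing with the document count, the fixed-dictionary assumption breaks down and the picture changes again.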