Comment by twotwotwo
3 days ago
There's now a largish category of tools where, unlike in OLTP systems, a big focus is scanning lots of data quickly (still O(n), but with a good constant factor): Redshift, Trino/Athena, ClickHouse, and DuckDB, among others.
Bloom filter indexing seems like a great fit if you ever need to do substring searches in a context like that, and for log searching in general, since a small filter per block of data lets the scan skip blocks that definitely don't contain the string. I haven't dug into which systems offer it, but at least ClickHouse appears to: https://clickhouse.com/docs/optimize/skipping-indexes#bloom-...
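To make the idea concrete, here's a rough sketch of n-gram Bloom filtering over blocks of log lines. This is only an illustration, not how ClickHouse or any other engine implements it: the trigram size, the 1024-bit filter, the SHA-256-derived hash positions, and all the function names are assumptions I picked for the example.

```python
import hashlib

NGRAM = 3      # assumed n-gram length (trigrams)
BITS = 1024    # assumed filter size per block, in bits
HASHES = 3     # assumed number of bit positions set per n-gram


def ngrams(text, n=NGRAM):
    """All distinct substrings of length n (empty if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def bit_positions(gram):
    """Derive HASHES bit positions from one SHA-256 digest of the n-gram."""
    digest = hashlib.sha256(gram.encode("utf-8")).digest()
    return [int.from_bytes(digest[4 * k:4 * k + 4], "big") % BITS
            for k in range(HASHES)]


def build_filter(lines):
    """Build one Bloom filter (stored as a Python int bitmask) for a block."""
    bits = 0
    for line in lines:
        for gram in ngrams(line):
            for pos in bit_positions(gram):
                bits |= 1 << pos
    return bits


def might_contain(bits, needle):
    """False means the block definitely lacks the needle; True means maybe.

    A needle shorter than NGRAM has no n-grams, so it can never be ruled out.
    """
    return all((bits >> pos) & 1
               for gram in ngrams(needle)
               for pos in bit_positions(gram))


def search(blocks, filters, needle):
    """Scan only the blocks whose filter cannot rule out the needle."""
    hits, scanned = [], 0
    for lines, bits in zip(blocks, filters):
        if not might_contain(bits, needle):
            continue  # skip the whole block without reading its lines
        scanned += 1
        hits.extend(line for line in lines if needle in line)
    return hits, scanned


blocks = [
    ["GET /index.html 200", "GET /about 200"],
    ["POST /login 500 upstream timeout", "GET /health 200"],
    ["GET /static/app.js 304", "GET /favicon.ico 404"],
]
filters = [build_filter(block) for block in blocks]
hits, scanned = search(blocks, filters, "timeout")
print(hits, f"(scanned {scanned} of {len(blocks)} blocks)")
```

The filter can give false positives (a block gets scanned and turns out not to match) but never false negatives, so the result is the same as a full scan, just with fewer blocks read. The work is still O(n) in the worst case; the filter only improves the constant.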