Comment by hodgesrm
5 days ago
RMT does not depend on background merges completing to give correct results, as long as you use FINAL to force a merge on read. The tradeoff is that performance suffers.
I'm a fan of what you are trying to do but there are some hard tradeoffs in dedup solutions. It would be helpful if your site defined exactly what you mean by deduplication and what tradeoffs you have made to solve it. This includes addressing failures in clustered Kafka / ClickHouse, which is where it becomes very hard to ensure consistency.
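To make the FINAL point concrete, here is a toy Python model of merge-on-read. It is only a sketch of the idea, not ClickHouse code: the `Row` type, the `parts` layout, and the `read_plain` / `read_final` helpers are hypothetical names made up for illustration, and it assumes a ReplacingMergeTree-style table with a version column where the highest version wins per key.

```python
from typing import NamedTuple


class Row(NamedTuple):
    key: str
    version: int
    value: str


# Two data parts that background merges have not combined yet (assumed layout).
# "order-1" appears in both parts with different versions.
parts = [
    [Row("order-1", 1, "pending"), Row("order-2", 1, "pending")],
    [Row("order-1", 2, "shipped")],
]


def read_plain(parts):
    """Read without FINAL: rows from every part are returned as stored."""
    return [row for part in parts for row in part]


def read_final(parts):
    """Toy merge-on-read: keep one row per key, the one with the highest version."""
    latest = {}
    for part in parts:
        for row in part:
            current = latest.get(row.key)
            # ">=" lets a later-read row win a version tie.
            if current is None or row.version >= current.version:
                latest[row.key] = row
    return sorted(latest.values())


if __name__ == "__main__":
    print(read_plain(parts))
    print(read_final(parts))
```

The plain read still sees both versions of `order-1` because the parts are unmerged, while the FINAL-style read has to look at every row across all parts to collapse them at query time, which is where the performance cost comes from.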