Comment by benjaminwootton

2 years ago

This would be a shame and also a mistake in my opinion.

ClickHouse is instantly differentiated from Snowflake, Databricks, BigQuery and Redshift by the open source offering that you can deploy yourself. There are lots of other options, but ClickHouse has the most mindshare and is the techie's choice.

I find myself rooting for them and recommending them for that before you even get into any technical comparison.

ClickHouse is also faster than any of them if you know how to use it properly. It helps if you have some distributed systems background and an intuitive feel for map/reduce.

For example, ReplacingMergeTree uses a distributed algorithm to process changes without incurring excessive INSERT-time expense. It's quite elegant.

  • Inserts should never have been expensive in the first place. This was probably hard for ClickHouse because they started with Postgres as the base, which is optimized for OLTP. In Apache Pinot, Druid, etc., an insert is nothing more than a simple append, and I believe that's the case today with ClickHouse as well... In other words, these things are table stakes today and are not differentiators.

    • This is a different problem. Updates are expensive in distributed columnar data. ReplacingMergeTree translates updates into inserts, which are very fast and always have been. It then collapses the superseded rows lazily, during background merges.
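To make the "updates become inserts, cleanup happens later" idea concrete, here is a rough toy model in Python. It is only a sketch of the semantics described above, not ClickHouse's actual code: the class name `ToyReplacingTable`, the `(key, version, value)` row shape, and the explicit `merge()` call are all made up for illustration (in ClickHouse the merge runs in the background on its own schedule).

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplacingTable:
    """Toy model of ReplacingMergeTree-style semantics (hypothetical, not ClickHouse code)."""

    # Each part is an immutable batch of (key, version, value) rows.
    parts: list = field(default_factory=list)

    def insert(self, rows):
        # An "update" is just another insert: append a new part, touch nothing else.
        self.parts.append(list(rows))

    def merge(self):
        # Lazy step: collapse all parts, keeping the highest version per key.
        # On a version tie, the row inserted later wins.
        latest = {}
        for part in self.parts:
            for key, version, value in part:
                if key not in latest or version >= latest[key][0]:
                    latest[key] = (version, value)
        self.parts = [[(k, ver, val) for k, (ver, val) in sorted(latest.items())]]

    def rows(self):
        # A plain read sees every row in every part, duplicates included.
        return [row for part in self.parts for row in part]


table = ToyReplacingTable()
table.insert([("user1", 1, "alice@old.example")])
table.insert([("user1", 2, "alice@new.example")])  # the "update"
before = table.rows()  # both versions are still present
table.merge()
after = table.rows()   # only the newest version per key survives
```

The point of the design is that the write path never has to find and rewrite the old row. The cost is that reads issued before the merge can still see stale duplicates, so queries have to tolerate or filter them.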

All the main players in ClickHouse's space, like Apache Pinot, Apache Druid, StarRocks, and PrestoDB, have mindshare and unicorns using their products. It sounds like you haven't seen what's happening in this space.