Comment by atwong

2 years ago

Lots of other competitors like Apache Pinot, Apache Druid and StarRocks are fighting in that default analytics space.

StarRocks, for example, has compute/storage separation in its open-source version.

  • When datasets are small and can easily fit into a single node [a few terabytes], this isn't as much of an issue. Yet when datasets grow far larger, or when compute/QPS needs grow while the dataset grows slower — when either side of the equation does not scale in balanced proportion with each other — that's when this separation of compute & storage becomes vital. [Either that, or you need to find hardware servers or cloud instance types that also support this imbalance of compute & storage, which is sometimes harder to do; it also locks you into a hardware configuration that cannot dynamically scale as needs and workloads change.]

    Apache Pinot also offers the same 2-tier compute/storage separation. And it also has nodes for minion [administrative] tasks. Again, these are more issues for larger scale analytical use cases.
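
    To make the imbalance concrete, here is a rough back-of-the-envelope sketch in Python. The function names and the per-node capacity figures are made up for illustration; they are not measurements from StarRocks, Pinot, or any other system.

```python
import math


def coupled_nodes(data_tb, qps, tb_per_node, qps_per_node):
    """Nodes needed when each node provides both storage and compute.

    The cluster must be sized for whichever dimension is larger,
    so the other dimension ends up over-provisioned.
    """
    for_storage = math.ceil(data_tb / tb_per_node)
    for_compute = math.ceil(qps / qps_per_node)
    return max(for_storage, for_compute)


def separated_nodes(data_tb, qps, tb_per_node, qps_per_node):
    """Storage and compute units needed when the two tiers scale independently.

    Returns (storage_units, compute_nodes).
    """
    storage_units = math.ceil(data_tb / tb_per_node)
    compute_nodes = math.ceil(qps / qps_per_node)
    return storage_units, compute_nodes


# Hypothetical workload: lots of data, modest query load.
# Assumed capacities: 4 TB and 100 QPS per node.
coupled = coupled_nodes(data_tb=500, qps=200, tb_per_node=4, qps_per_node=100)
storage, compute = separated_nodes(data_tb=500, qps=200, tb_per_node=4, qps_per_node=100)
```

    Under these assumed numbers, the coupled layout needs 125 full nodes just to hold the data, while the separated layout needs the same amount of storage but only 2 compute nodes. Flip the ratio (small data, high QPS) and the waste moves to the storage side instead.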

    • > Apache Pinot also offers the same 2-tier compute/storage separation.

      Based on looking at the docs, I don't think so. Maybe only with HDFS. Feel free to link to a page that says otherwise.

  • Not only that. Transactions, UPDATEs, a cost-based optimizer (CBO), better join optimizations.

    It seems that someone is stuck in 2016, when no good open-source alternatives to ClickHouse existed.