Comment by saisrirampur

1 day ago

Thanks for posting this question! Compared to Snowflake and Databricks, a few key differences in our approach are:

(a) An initial focus on real-time, customer-facing applications rather than trying to boil the ocean. This also aligns with where the Postgres + ClickHouse combination has really shined for our users. Both Postgres and ClickHouse are designed primarily for developers building their system-of-record applications.

(b) Every component in the stack is open source—Postgres, ClickHouse, PeerDB for native CDC, pg_clickhouse, and Ubicloud Postgres (our data plane component). We plan to keep it that way as much as possible, as this strongly aligns with our ethos.

(c) As you noted, Postgres is NVMe-backed, and the focus is on performance and scalability while maintaining top-notch reliability. We think this is more meaningful to fast-growing (AI-driven) workloads than instant provisioning and forking. I talk about this a bit more here: https://clickhouse.com/blog/postgres-managed-by-clickhouse#p...

Thanks! Out of curiosity, does NVMe have a big effect on replication throughput? I've been wondering how much of the trouble I've had with other solutions is due to parsing WAL and how much is just slow cloud disk.

  • Very interesting question. It depends on the use case. We have seen quite a few workloads where logical replication gets throttled on I/O (reorder buffer), and NVMe-based disk access should help a lot there. This happens specifically with larger or interleaved transactions. We plan to test this at production scale soon. Stay tuned for more learnings!

Is being NVMe-backed a cost disadvantage?

  • Great question! It really depends on the workload. We already support NVMe instances as small as 4 GB RAM / 2 vCPUs. For HA setups, you could go with one standby (with configurable synchronous replication) or two standbys (cross-AZ, with quorum-based replication). So yes, the standbys add some hardware cost, but depending on the workload, NVMe performance could offset it. On top of this, there’s a separate topic around the reliability/availability promises of separating storage and compute for an OLTP Postgres database.