
Comment by devoxi

2 years ago

They are both columnar data stores, and while they solve the same problem I wouldn't use them in the same situation. DuckDB is often referred to as the SQLite of analytics, meaning that it's lightweight and you can embed it in your application. ClickHouse, on the other hand, is definitely the way to go if you need to distribute your queries over multiple servers. If your workload fits on a single server and you only need standard SQL functions, both will serve you well. If you have more specific needs, check each project's documentation for the features you depend on. For example, ClickHouse has very extensive support for nested arrays, which can prove quite useful.

DuckDB has also gotten mindshare as an engine for reading Parquet from data lakes. The fact that it's embeddable enables some very creative uses. It helped that for a time DuckDB was substantially quicker than ClickHouse at reading Parquet. That advantage has eroded with recent improvements to ClickHouse's Parquet support, and I expect the remaining gap will close quickly.