Comment by adsharma
1 day ago
That's right - it was a fun 2 out of 3 analogy.
The real question being raised in the blog post is - should the next generation graph databases pursue a local-only embedded strategy or build on top of object storage like many non-graph and vector embedded databases are doing.
Specifically, DuckLake (using system catalog for metadata instead of JSON/YAML) is interesting. I became aware of Apache GraphAr (incubating) after writing the blog post. But it seems to be designed for data interchange between graph databases instead of designing for primary storage.
I only mentioned it because I clicked it wondering if someone had found a way to "cheat" CAP for graph databases. When I saw that it was being used as an analogy and not literally, I figured I'd comment.
I still don't quite get the analogy. What are the 2 out of 3 that you can have? The second paragraph I quoted gives a classic 1 out of 2 dilemma - either scalable _or_ open-source.
DuckDB is scalable (can handle TPC-H 1TB) and open source, but doesn't support graphs natively. It supports some graph queries on a SQL native columnar storage.
With the proposed solution, you'll be able to query larger graphs on an open source graph native engine. Thus beating the "CAP theorem for graphs".