Abandoned for a new project. Kuzu is Japanese for unwanted/useless scraps or garbage, so I suppose it's still living up to its name.
For anyone who's curious—the project was originally named after the Sumerian word for "wisdom"[1].
1. https://web.archive.org/web/20250318034702/https://blog.kuzu...
Wow the Japanese were pretty savage back then.
The kana spelling (which is what phonetically would sound like 'kuzu') can refer to either scraps/garbage, or the Kudzu plant.
Kuzu means sheep in Turkish.
Not sheep, lamb.
I've been excited about Kuzu DB as a SQLite-style graph database. It looks like the devs are moving on to something else and no longer will support it, as of 10 October.
Their message reads, "Kuzu is working on something new! We will no longer be actively supporting KuzuDB. You can access the full archive of KuzuDB here: GitHub" https://github.com/kuzudb/kuzu
Oh too bad. Small fast embedded graph DBs are rare. Any good alternatives?
SurrealDB: https://surrealdb.com/
There was a CIDR 2023 paper[1] demonstrating that the extension DuckPGQ[2] for DuckDB (an embedded database) offers competitive graph query performance compared to Neo4j and Umbra. No data on how it compares to KuzuDB.
[1] https://vldb.org/cidrdb/papers/2023/p66-wolde.pdf
[2] https://duckpgq.org/
There used to be a similarly named one called CozoDB[0], which was pretty awesome, but it looks like its development has slowed down significantly.
[0] https://github.com/cozodb/cozo
You should check FalkorDB https://github.com/falkordb/falkordb
Not embedded last I checked. Unless that changed.
Posted below: GFQL is also OSS and architecturally similar, though slightly different goals and features: https://news.ycombinator.com/item?id=45560036#45561807
Perhaps Dgraph if using Go. Or SurrealDB, though it's the opposite of small: it's an all-in-one, do-everything DB. I'm excited to see how it matures.
Fork it and organize support for it.
Someone linked to a DuckDB extension above that shows some graph support.
Link to kuzudb internals:
https://kuzudb.com/docs/developer-guide/database-internal/
Note that the repo mentions "some of our resources are moving from our website to GitHub: "Docs: http://kuzudb.github.io/docs, Blog: http://kuzudb.github.io/blog" but those links currently redirect to kuzudb.com. I presume they won't be covering the domain name costs in the future and that the transition is in-progress.
https://duckdb.org/community_extensions/extensions/duckpgq.h...
Hi there, leading DuckPGQ developer here :) Thanks for the shoutout! I've been busy working on an internship at DuckDB labs so DuckPGQ has gotten less attention, but I'll get back to it soon (December most likely) and will update the extension to support DuckDB v1.4.0 and v1.4.1 this week hopefully.
PGQ requires you to write using SQL and read using a graph query language. GQL is a standalone language that supports both reads and writes. But much of the community is still using Cypher.
More on this here:
https://adsharma.github.io/beating-the-CAP-theorem-for-graph...
As far as I can tell, this has nothing to do with CAP theorem or distributed systems. It's just being used as an analogy.
> [CAP theorem] states that any distributed storage system can provide only two of these three guarantees: Consistency, Availability and Partition safety.
> In the realm of graph databases, we observe a similar “two out three” situation. You can either have scalable systems that are not fully open source or you can have open source systems designed for small graphs. Details below.
(the article follows)
> This is one solution to the CAP theorem for graphs. We can store a billion scale graph using this method in parquet files and use a free, cheap and open source solution to traverse them, perform joins without storage costs that are prohibitively high.
DuckPGQ is an interesting option, but unfortunately, that project hasn't been touched in a few months and does not currently work with the latest version of DuckDB.
Hi there, leading DuckPGQ developer here. I've been busy with other projects but will get back to it soon enough :)
GitLab just announced a knowledge graph built on Kuzu DB. I wonder how it will turn out.
A couple companies using Kuzu in products are talking about joining efforts on a community fork, including Gitlab and Kineviz. Possible future home of that work: https://github.com/Kineviz/bighorn
can you share a link?
https://gitlab-org.gitlab.io/rust/knowledge-graph/getting-st...
Strangely enough, it was just that day when I discovered this formidable embeddable graph database that the "archived" banner also appeared. Bummer. I wonder why they stopped as there was a long string of commits for years.
I use the Python Kuzu graph database library, super convenient for local experiments. I see no reason to stop using it. The underlying database is archived on GitHub so it isn’t going anywhere.
One thing you might want to watch out for is that the storage format on disk is not stabilized.
Over the last few releases, you couldn't open a file written by a previous version of Kuzu. You had to export and re-import every time a new version was released.
This is no longer a problem for kuzu because development has stopped. But any open source fork needs to think about how to stabilize storage.
In the past few releases kuzu switched from database as a directory to a single file database.
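For anyone stuck on an older file, the export/import round trip described above might look roughly like the sketch below. It assumes the `kuzu` Python package and Kuzu's `EXPORT DATABASE` / `IMPORT DATABASE` statements; the helper names and all paths are hypothetical, and the two steps have to run under different installed versions, so they are separate calls.

```python
def _quoted(path):
    """Wrap a path in single quotes for use in a statement (hypothetical helper)."""
    text = str(path)
    if "'" in text:
        raise ValueError("paths containing single quotes are not handled in this sketch")
    return f"'{text}'"


def export_statement(export_dir):
    """Build the statement that dumps schema and data into export_dir."""
    return f"EXPORT DATABASE {_quoted(export_dir)};"


def import_statement(export_dir):
    """Build the statement that loads a previous export into an empty database."""
    return f"IMPORT DATABASE {_quoted(export_dir)};"


def run(db_path, statement):
    """Open the database at db_path and execute one statement."""
    import kuzu  # third-party; imported lazily so the builders work without it

    db = kuzu.Database(str(db_path))
    conn = kuzu.Connection(db)
    conn.execute(statement)


# Step 1, with the OLD kuzu version installed (paths are placeholders):
#   run("old_db", export_statement("/tmp/kuzu_export"))
# Step 2, after upgrading kuzu, against a fresh database:
#   run("new_db", import_statement("/tmp/kuzu_export"))
```

Since the archived release is now the last one, a fork that wants to change the storage layout would presumably need to keep a path like this working.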
Reposting:
--
Rough news on kuzu being archived - startups are hard and Semih + Prashanth did so much in ways I value!
For those left in the lurch for compute-tier Apache Arrow-native graph queries for modern OSS ecosystems, GFQL [1] should be pretty fascinating, and hopefully less stress due to a sustainable governance model. Likewise, as an oss deeptech community, we add interesting new bits like the optional record-breaking GPU mode with NVIDIA Rapids [4].
GFQL, the graph dataframe-native query language, is increasingly how Graphistry, Inc. and our community work with graphs at the compute tier. Whether the data comes from a tabular ETL pipeline, a file, SQL, nosql, or a graph storage DB, GFQL makes it easy to do on-the-fly graph transforms and queries at the compute tier at sub-second speeds for graphs anywhere from 100 edges to 1,000,000,000 [3]. Currently, we support arrow/pandas, and arrow / nvidia rapids as the main engine modes.
While we're not marketing it much yet, GFQL is already used daily by every single Graphistry user behind-the-scenes, and directly by analysts & developers at banks, startups, etc around the world. We built it because we needed an OSS compute-tier graph solution for working with modern data systems that separate storage from compute. Likewise, data is a team sport, so it is used by folks on teams who have to rapidly wrangle graphs, whether for analysis, data science, ETL, visualization, or AI. Imagine an ETL pipeline or notebook flow or web app where data comes from files, elastic search, databricks, and neo4j, and you need to do more on-the-fly graph stuff with it.
We started [4] building what became GFQL before Kuzu because it solves real architectural & graph productivity problems that have been challenging our team, our users, and the broader graph community for years now. Likewise, by going dataframe-native & GPU-mode from day 1, it's now a large part of how we approach GPU graph deep tech investments throughout our stack, and means it's a sustainably funded system. We are looking at bigger R&D and commercial support contracts with organizations needing to do subsecond billion+-scale with us so we can build even more, faster (hit me up if that's you!), but overall, most of our users are just like ourselves, and the day-to-day is wanting an easy OSS way to wrangle graphs in our apps & notebooks. As we continue to smooth it out (ex: we'll be adding a familiar Cypher syntax), we'll be writing about it a lot more.
Links:
* ReadTheDocs: SQL <> Cypher <> GFQL - https://pygraphistry.readthedocs.io/en/latest/gfql/translate...
* pip install: https://pypi.org/project/graphistry/
* 2025 keynote - OSS interactive billion-edge GFQL analytics on 1 gpu: https://www.linkedin.com/posts/graphistry_at-graph-the-plane...
* 2022 blogpost w/ Ben Lorica first painting the vision: https://thedataexchange.media/the-graph-intelligence-stack/
With property graphs being adopted in the SQL standard, this isn’t surprising.
The fact that GQL is now supported by some relational databases doesn't mean they'll become an alternative to native graph databases.
Yeah I guess it’s like saying that relational DBs supporting native JSON type meant the end of NoSQL DBs.
Yeah, so sad as a contributor and downstream user.
Hopefully they will ship cool new things.
If I can't trust their first project (KuzuDB), then why on earth would I trust any subsequent project by them? I won't.
This is why I stick to SQLite or PostgreSQL when it comes to databases. An LLM can trivially write me the commonly necessary graph queries if I should need them.
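To make the "graph queries on SQLite" point concrete, here is a minimal sketch of a bounded reachability query using a recursive CTE and Python's built-in `sqlite3` module. The `edges` table, its columns, and the `reachable` helper are made up for illustration; this is one common pattern, not a claim about how any graph database does it.

```python
import sqlite3


def reachable(conn, start, max_hops=3):
    """Return {node: fewest hops} for nodes reachable from start within max_hops."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(node, hops) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, w.hops + 1
            FROM walk AS w
            JOIN edges AS e ON e.src = w.node
            WHERE w.hops < ?          -- the hop limit also guarantees termination on cycles
        )
        SELECT node, MIN(hops) FROM walk GROUP BY node ORDER BY MIN(hops), node
        """,
        (start, max_hops),
    ).fetchall()
    return dict(rows)


# Tiny directed graph with a cycle (a -> b -> c -> a) and an unrelated edge (x -> y).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("x", "y")],
)

result = reachable(conn, "a")
print(result)
```

This covers simple traversals; variable-length pattern matching or shortest paths over large graphs are where a dedicated engine tends to pull ahead.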
My best guess is the company was acqui-hired and will soon be working on implementing Kuzu's tech in a different database owned by the acquirer.
My _hope_ is that it was some IP issue with the University of Waterloo and a new company will appear shortly and pretty much pick up where they left off, but that's probably just wishful thinking on my part.
Why does an MIT-licensed open source project owe you anything whatsoever?
How did you interpret this person comment as about being owed anything? It's simply a fact of life that it's not smart to put your eggs into an unstable basket.
It's not about what is owed; it's about what can be trusted. The people behind Kuzu have shown that their software cannot be trusted for long-term use.
KuzuDB was actively working on their cloud/enterprise solution and talking with people signing up for it. I wonder if the timing is related.
kuzu is a great project.