Comment by mritchie712

16 hours ago

> Can I use DuckDB with Quack as the catalog database for DuckLake?

> Not yet, but we are working on it!

Seems like a niche use case, but it's the one I'm most interested in.

Our lakehouse uses DuckLake with Postgres as the catalog. Seems like a DuckDB / Quack catalog would be an excellent alternative.

I think that Quack will become the primary option for a DuckLake catalog in the future, for several reasons. To list a few:

1. No type mismatches for inlining. If you use a non-DuckDB catalog, many types do not have a 1:1 mapping, which introduces additional overhead when operating on those data types.

2. You get the raw performance of DuckDB analytics (and now transactions) over the catalog. DuckDB reading DuckDB is simply faster than any of our Postgres/SQLite scanners.

3. No round-trip for retries. We can easily(tm) run the full retry logic on the DuckDB server side. Right now, these retries trigger multiple round trips for Postgres, making it a performance bottleneck for high-contention workloads.
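To make point 3 concrete, here is a toy model of why the location of the retry loop matters. It is only a sketch, not DuckLake's or Quack's actual protocol: the `Catalog` class, the two function names, and the assumption that each client-side attempt costs two round trips (one read, one commit) are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Catalog:
    """Hypothetical catalog: a commit succeeds only if the writer saw the latest snapshot."""
    snapshot_id: int = 0

    def try_commit(self, seen_snapshot: int) -> bool:
        if seen_snapshot != self.snapshot_id:
            return False  # someone else committed in between: conflict
        self.snapshot_id += 1
        return True


def client_side_retry(catalog: Catalog, conflicts: int) -> int:
    """Retry loop on the client: every attempt pays network round trips (assumed: 2)."""
    round_trips = 0
    while True:
        round_trips += 1  # read the current snapshot over the network
        seen = catalog.snapshot_id
        if conflicts > 0:
            catalog.snapshot_id += 1  # simulate a concurrent writer winning the race
            conflicts -= 1
        round_trips += 1  # send the commit over the network
        if catalog.try_commit(seen):
            return round_trips


def server_side_retry(catalog: Catalog, conflicts: int) -> int:
    """Retry loop next to the catalog: the client sends a single request."""
    round_trips = 1
    while True:
        seen = catalog.snapshot_id
        if conflicts > 0:
            catalog.snapshot_id += 1  # same simulated contention as above
            conflicts -= 1
        if catalog.try_commit(seen):
            return round_trips


if __name__ == "__main__":
    for n in (0, 3, 10):
        print(n, client_side_retry(Catalog(), n), server_side_retry(Catalog(), n))
```

In this model the client-side cost grows linearly with the number of conflicts, while the server-side cost stays at one request regardless of contention. Real numbers depend on the actual commit protocol and network latency.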

Disclaimer: I'm a DuckDB/DuckLake developer.

Well, we are really working on it: https://github.com/duckdb/ducklake/pull/1151

So you'll be able to test it in a few days.

  • Does this mean I can finally connect to a DuckLake instance hosted remotely? i.e. DuckLake writes to disk on the remote server and my client is just a client.

    Because right now, even with Postgres as the catalog, my client needs access to the underlying storage to use DuckLake.

    • Yes, Quack resolves this problem. In particular, your client (likely a DuckDB instance) will talk to a remote DuckDB that both has access to the underlying storage and can also serve as the catalog itself.