Comment by citguru
3 days ago
Yes, this is actually one of the core problems OpenDuck's architecture addresses.
The short version: OpenDuck interposes a differential storage layer between DuckDB and the underlying file. DuckDB still sees a normal file (via FUSE on Linux or an in-process FileSystem on any platform), but underneath, writes go to append-only layers and reads are resolved by overlaying those layers newest-first. Sealing a layer creates an immutable snapshot.
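To make the layering idea concrete, here is a minimal toy sketch in Python. It is not OpenDuck's code or API: the `LayeredFile` class, its method names, the block-level granularity, and the integer snapshot ids are all assumptions made for illustration. It only shows the general technique described above: writes land in an active layer, sealing freezes that layer, and reads walk the layers newest-first.

```python
from types import MappingProxyType
from typing import Dict, List, Mapping, Optional

BLOCK_SIZE = 4  # tiny block size, purely for demonstration


class LayeredFile:
    """Hypothetical block store built from append-only layers."""

    def __init__(self) -> None:
        self.sealed: List[Mapping[int, bytes]] = []  # oldest layer first
        self.active: Dict[int, bytes] = {}           # receives all new writes

    def write_block(self, block_no: int, data: bytes) -> None:
        # Writes never modify a sealed layer; they only touch the active one.
        self.active[block_no] = data

    def seal(self) -> int:
        # Freeze the active layer behind a read-only view and start a new one.
        # The returned id is the number of sealed layers the snapshot can see.
        self.sealed.append(MappingProxyType(self.active))
        self.active = {}
        return len(self.sealed)

    def read_block(self, block_no: int, snapshot: Optional[int] = None) -> bytes:
        # A snapshot reader sees only the layers sealed up to its id;
        # the writer (snapshot=None) also sees its own active layer.
        if snapshot is None:
            layers = self.sealed + [self.active]
        else:
            layers = self.sealed[:snapshot]
        # Overlay newest-first: the most recent layer holding the block wins.
        for layer in reversed(layers):
            if block_no in layer:
                return layer[block_no]
        return b"\x00" * BLOCK_SIZE  # block was never written
```

In this sketch, a reader holding an older snapshot id keeps getting the same bytes no matter what the writer does afterwards, because later writes only ever land in newer layers.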
This gives you:
Many concurrent readers: each reader opens a snapshot, which is a frozen, consistent view of the database. Readers don't touch the writer's active layer at all, so there is no lock contention.
One serialized write path: multiple clients can submit writes, but they're ordered through a single gateway/primary rather than racing on the same file. This is intentional: DuckDB's storage engine was never designed for multi-process byte-level writes, and pretending otherwise leads to corruption. Instead, OpenDuck serializes mutations at a higher level and gives you safe concurrency via snapshots.
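As a rough illustration of the "one serialized write path" idea, the sketch below funnels writes from several client threads through a single queue that one worker drains in order. This is a generic single-writer pattern, not OpenDuck's gateway: the `WriteGateway` class and its methods are hypothetical names, and a real gateway would apply each mutation to the database rather than append it to a list.

```python
import queue
import threading
from typing import Any, List


class WriteGateway:
    """Hypothetical gateway: many submitters, exactly one applier."""

    def __init__(self) -> None:
        self._inbox: "queue.Queue[Any]" = queue.Queue()
        self.applied: List[Any] = []  # global order in which writes were applied
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, mutation: Any) -> None:
        # Clients never touch the storage directly; they only enqueue.
        self._inbox.put(mutation)

    def _run(self) -> None:
        # Only this thread applies mutations, so writes cannot race each other.
        while True:
            mutation = self._inbox.get()
            if mutation is None:  # sentinel: stop after draining earlier items
                break
            self.applied.append(mutation)

    def close(self) -> None:
        self._inbox.put(None)
        self._worker.join()
```

Several threads can call `submit` at the same time, and the queue gives their writes one total order. Each client's own writes stay in the order that client submitted them, since the queue is first-in, first-out.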
So for your specific scenario, where one process is writing while you want to quickly inspect or query the DB from the CLI, you'd be able to open a read-only snapshot mount (or attach with ?snapshot=<uuid>) from a second process and query freely. The writer keeps going, new snapshots appear as checkpoints seal, and readers can pick up the latest snapshot whenever they're ready.
It's not unconstrained multi-writer OLTP (that's an explicit non-goal), but it does solve the "I literally cannot even read the database while another process has it open" problem that makes DuckDB painful in practice.