Building a Distributed Database in Elixir, Part 3: Storage Layer and Why RocksDB

8 hours ago (medium.com)

Part 3 of my distributed database series. This one covers the storage engine decision - the foundation everything else sits on.

Main topics: how key encoding lets a single ordered KV store emulate document/graph/time-series models, LSM-tree vs B-tree trade-offs, and the benchmark that killed my pure Elixir dreams.
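To make the key-encoding idea concrete, here's a minimal sketch of how prefixed, byte-ordered keys let one ordered KV store serve multiple models. The module and function names (`KeyCodec`, `doc_key`, `ts_key`) are illustrative, not from the post:

```elixir
# A minimal sketch of multi-model key encoding over an ordered KV store.
# Binaries in Elixir compare bytewise, which is exactly the ordering an
# LSM-tree or B-tree gives you, so careful encoding makes range scans
# line up with each data model's natural access pattern.
defmodule KeyCodec do
  # Documents: <collection> <> 0x00 <> <id>. The 0x00 separator keeps
  # collections from interleaving, so a prefix scan over
  # "users" <> <<0>> returns exactly that collection's documents.
  def doc_key(collection, id), do: collection <> <<0>> <> id

  # Time series: <series> <> 0x00 <> <big-endian timestamp>. Big-endian
  # unsigned integers sort bytewise in numeric order, so iterating the
  # series prefix yields points in time order for free.
  def ts_key(series, unix_ms), do: series <> <<0, unix_ms::unsigned-big-64>>
end
```

The same trick extends to graph edges (e.g. `source <> 0x00 <> edge_type <> 0x00 <> target`), which is what lets a single ordered store fake several models at once.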

I wanted to use CubDB (pure Elixir, no NIF risks, easy debugging). The benchmarks said otherwise: RocksDB was 177x faster on writes and used 26,000x less memory during batch operations. For a distributed database, that gap is insurmountable.

The post also covers living with NIFs in Elixir - native code runs inside the BEAM's own OS process, outside its fault-isolation guarantees, so a crash kills your whole VM instead of just a process (and a long-running NIF call can tie up a scheduler). You architect around it: shard isolation, replication, aggressive monitoring.

Also discussed: RocksDB column families (underrated feature for multi-model storage), write amplification as the LSM-tree tax, and why this approach handles time-series data but won't compete with columnar engines like ClickHouse for pure analytics.
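On the write-amplification point, a back-of-envelope model helps build intuition. This is a standard approximation for leveled compaction (one write to the WAL, one at memtable flush, roughly `fanout` rewrites per level during compaction), not a measurement from the post:

```elixir
# Rough write-amplification model for leveled LSM compaction.
# Each logical byte is written once to the WAL, once when the
# memtable flushes to L0, and approximately `fanout` times per
# level as compaction rewrites it down the tree.
defmodule LsmMath do
  def write_amp(levels, fanout), do: 1 + 1 + levels * fanout
end
```

With a typical RocksDB-ish setup of 4 levels and a fanout of 10, each logical byte costs on the order of 42 physical bytes written - the "LSM-tree tax" the post refers to, and the price paid for cheap sequential writes on ingest.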

Next post will cover Raft consensus for metadata and how the CP metadata plane coordinates with the AP data plane.

Happy to discuss storage engine choices, NIF risk mitigation, or whether the CubDB benchmarks surprised anyone else who's used it.

I really love these pure key-value store tools, man. The things you can do with them are insane. The bad part is that most application developers don't know about them.