Comment by vvern
4 hours ago
About time for the 6th Edition, eh? What would folks include in it?
- Vector databases and hybrid search?
- Object storage for all the things? Lake houses. Parquet and beyond.
- Continuously materialized views? I'm not sure this one has made a splash, but I think about Naiad (Materialize) and Noria (Readyset).
- NewSQL went mostly mainstream (Spanner wasn't included in the last one, but there's been more here with things like CockroachDB, TiDB, etc.)
The object storage stuff is new, but it has mostly confirmed that the older architecture works. MPP over shared (S3) storage, with everything above that on local SSD and compute, delivers the best performance. Even Snowflake finally came out with "interactive" warehouses with this architecture.
Parquet, Iceberg, and other open formats seem good, but they may hit a complexity wall. There's already some inconsistency between platforms, e.g. with deletion vectors.
Incremental view maintenance interests me as well, and I would like to see it more available on different platforms. It's ironic that people use dbt etc. to test every little edit of their manually coded delta pipelines, but don't look at IVM.
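For anyone who hasn't looked at IVM, here is a rough sketch of the core idea in Python: keep the view's state around and apply only the inserted and deleted rows, instead of recomputing the whole query. The names (`SumByKeyView`, `apply_delta`, `recompute`) are made up for illustration and don't come from Materialize, Readyset, or any other product; real engines handle joins, ordering, and consistency, which this toy ignores.

```python
from collections import defaultdict


class SumByKeyView:
    """Hypothetical view for: SELECT key, SUM(amount) FROM t GROUP BY key."""

    def __init__(self):
        self.sums = defaultdict(int)    # running SUM(amount) per key
        self.counts = defaultdict(int)  # row count per key, to know when a group vanishes

    def apply_delta(self, delta):
        # delta is an iterable of (key, amount, weight):
        # weight +1 means the row was inserted, -1 means it was deleted.
        for key, amount, weight in delta:
            self.sums[key] += weight * amount
            self.counts[key] += weight
            if self.counts[key] == 0:
                # No rows left in this group, so it leaves the view.
                del self.sums[key]
                del self.counts[key]

    def rows(self):
        return dict(self.sums)


def recompute(base_rows):
    """Full recomputation over (key, amount) rows, for comparison."""
    out = defaultdict(int)
    for key, amount in base_rows:
        out[key] += amount
    return dict(out)


view = SumByKeyView()
view.apply_delta([("a", 10, +1), ("b", 5, +1), ("a", 7, +1)])  # three inserts
view.apply_delta([("a", 10, -1), ("b", 5, -1)])                # two deletes

# The base table now holds only ("a", 7); both paths should agree.
assert view.rows() == recompute([("a", 7)])
```

The work per update is proportional to the size of the delta, not the size of the table, which is the property the hand-coded delta pipelines are trying to get.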
LLMs as DBs (if you squint hard enough)