
Comment by swaminarayan

6 hours ago

How well does the Postgres-only approach hold up as data grows — did you benchmark it against Elasticsearch or a dedicated vector DB?

I've run small-scale experiments with up to 100-500k rows and didn't notice any significant degradation in search query latency - p95 stayed well under 1s.

I haven't directly compared against Elasticsearch yet, but I plan to do that next and publish some numbers. There's a benchmark harness set up already: https://github.com/getomnico/omni/tree/master/benchmarks, but there are a couple of issues I need to address before I do a large-scale run (the ParadeDB index settings need some tuning).
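For context, the measurement side of a harness like that can be very simple: time each query and report the 95th percentile. This is a minimal sketch, not the actual code in the repo - `run_query` is a placeholder for whatever executes the real search (e.g. a psycopg `cursor.execute` call):

```python
import statistics
import time

def measure_p95(run_query, n_runs=100):
    """Time a query callable n_runs times; return p95 latency in seconds."""
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        run_query()  # placeholder for the real search query
        samples.append(time.perf_counter() - start)
    # quantiles(n=20) returns 19 cut points; the last one (index 18) is p95
    return statistics.quantiles(samples, n=20)[18]
```

The p95-over-wall-clock approach deliberately includes client-side overhead, which is what users actually experience.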

We have a pretty intensively used Postgres-backed app handling thousands of concurrent users. After 6 years and thousands of paying customers, we're only now seeing the limits of what it can support on the horizon. TLDR: when you get there, you can hire some people to help you break things off as needed. If you're still trying to prove your business model and carve yourself a segment of the market, just use Postgres.

  • Thanks for sharing! That's a big part of the reason I decided on Postgres - everything I've read about people running it in prod suggests that most organizations never grow beyond what it offers.

    • Most of the time, just re-casting what you want in a horizontally shardable way is the "right" way to scale any RDBMS - but at this point you can get boxes on AWS with 32 TiB of RAM, and most organizations don't have that much total data across their entire suite of stuff (many do, most don't).
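To make "horizontally shardable" concrete: the usual recast is to pick a shard key (a tenant or user id) and route every query by a stable hash of it, so each shard is just an ordinary Postgres instance. A minimal sketch, with made-up shard names and key:

```python
import hashlib

# Hypothetical connection targets - one ordinary Postgres instance per shard.
SHARDS = ["pg-shard-0", "pg-shard-1", "pg-shard-2", "pg-shard-3"]

def shard_for(key: str) -> str:
    """Route a row to a shard by a stable hash of its shard key.

    md5 (rather than Python's built-in hash()) keeps the mapping
    stable across processes and restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]
```

The same key always lands on the same shard, so queries that stay within one tenant never need to fan out.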