Comment by saisrirampur

9 hours ago

I personally see a difference between “just use Postgres” and “make Postgres your default choice.” The latter leaves room to evaluate alternatives when the workload calls for it, while the former does not. When that nuance gets lost, it can become misleading for teams that are hitting, or even close to hitting, the limits of Postgres, and who may keep tuning Postgres, spending not only time but also significant $$. IMO a better world is one where developers have a mindset of using best-in-class tools where needed. This is where embracing integrations with Postgres will be helpful!

I think that the key point being made by this crowd, of which I'm one, is somewhere in the middle. The way I mean it is "Make Postgres your default choice. Also *you* probably aren't doing anything special enough to warrant using something different".

In other words, there are people and situations where it makes sense to use something else. But most people believing they're in that category are wrong.

  • > Also you probably aren't doing anything special enough to warrant using something different.

    I always get frustrated by this because it is never made clear where the transition occurs to where you are doing something special enough. It is always dismissed as, "well, whatever it is you are doing, I am sure you don't need it."

    Why is this assumption always made, especially on sites like HackerNews? There are a lot of us here that DO work with scales and workloads that require specialized things, and we want to be able to talk about our challenges and experiences, too. I don't think we need to isolate all the people who work at large scales to a completely separate forum; for one thing, a lot of us work on a variety of workloads, where some are big enough and particular enough to need a different technology, and some that should be in Postgres. I would love to be able to talk about how to make that decision, but it is always just "nope, you aren't big enough to need anything else"

    I was not some super engineer who already knew everything when I started working on large enough data pipelines that I needed specialized software, with horizontal scaling requirements. Why can't we also talk about that here?

    • And another related one: "you’ll know when you need it."

      No I don’t. I’ve never used the thing so I don’t know when it’ll come in useful.

The point is really that you can only evaluate which of the alternatives is better once you have a working product with enough data; otherwise it's basically following trends and hoping your barely informed decision won't be wrong.

  • Postgres is widely used enough, with enough engineering company blog posts, that the vast majority of NotPostgres requests already have a blog post demonstrating either that pg falls over at the scale being planned for, or that it doesn’t.

    If there isn’t one, the trade-off for NotPostgres is such that it’s justifiable to make the engineer run their own benchmarks before they are allowed to use NotPostgres.

  • Agree to disagree here. I see a world where developers need to think about (reasonable) scale from day one, or at least very early. We’ve been seeing this play out at ClickHouse - the time before teams need purpose-built OLAP is shrinking from years to months. Also, integrating with ClickHouse is a few weeks of effort for potentially significantly faster analytics performance.

    • Reasonable scale means... what exactly?

      Here's my opinion: just use Postgres. If you're experienced enough to know when that advice doesn't apply to you, go for it; the advice isn't for you. If you aren't, I'm probably saving you from yourself. "Reasonable scale" to these people could mean dozens of inserts per second, which is why people talking in vagaries around scale is maddening to me. If you aren't going to actually say what that means, you will lead people who don't know better down the wrong path.

    • I see a world where developers need to think about REASONABLE scale from day one, with all caps and no parentheses.

      I've sat in on meetings about adding auth rate limiting, using Redis, to an on-premises Electron client/Node.js server where the largest installation had 20 concurrent users and the largest foreseeable installation had a few thousand, and where every existing installation had an average server CPU utilisation of less than a percent.

      Redis should not even be a possibility under those circumstances. It's a ridiculous suggestion based purely on rote whiteboard interview cramming. Stick a token_bucket table in Postgres.

      I'm also not convinced that thinking about reasonable scale would lead to a different implementation for most other greenfield projects. The nice thing about shoving everything into Postgres is that you nearly always have a clear upgrade path, whereas using Redis right from the start might actually make the system less future-proof by complicating any eventual migration.
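      To make the "token_bucket table in Postgres" suggestion concrete, here is a minimal in-memory sketch of the token-bucket logic such a table would hold one row of per client: a token count plus a last-refill timestamp, refilled lazily on each request. The class and names here are hypothetical illustrations, not anything from the thread; in Postgres the same logic would be a single `UPDATE ... RETURNING` in a transaction so concurrent requests serialize on the row.

      ```python
      import time


      class TokenBucket:
          """Sketch of the state a hypothetical `token_bucket` table row
          would carry: current tokens and the time of the last refill."""

          def __init__(self, capacity, refill_per_sec, now=None):
              self.capacity = capacity
              self.refill_per_sec = refill_per_sec
              self.tokens = capacity  # start full
              self.last_refill = time.monotonic() if now is None else now

          def allow(self, now=None):
              """Lazily refill based on elapsed time, then try to spend
              one token. Returns True if the request is allowed."""
              if now is None:
                  now = time.monotonic()
              elapsed = now - self.last_refill
              self.tokens = min(self.capacity,
                                self.tokens + elapsed * self.refill_per_sec)
              self.last_refill = now
              if self.tokens >= 1:
                  self.tokens -= 1
                  return True
              return False


      # 3 requests allowed immediately, the 4th rejected, and one more
      # allowed after a second of refill at 1 token/sec.
      bucket = TokenBucket(capacity=3, refill_per_sec=1.0, now=0.0)
      print([bucket.allow(now=0.0) for _ in range(4)])  # [True, True, True, False]
      print(bucket.allow(now=1.0))                      # True
      ```

      The upgrade-path point above holds here too: this logic ports unchanged to a Postgres table, and only the storage of `tokens`/`last_refill` moves into the row.
      
      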