
Comment by molf

4 days ago

> I'd argue self-hosting is the right choice for basically everyone, with the few exceptions at both ends of the extreme:

> If you're just starting out in software & want to get something working quickly with vibe coding, it's easier to treat Postgres as just another remote API that you can call from your single deployed app

> If you're a really big company and are reaching the scale where you need trained database engineers to just work on your stack, you might get economies of scale by just outsourcing that work to a cloud company that has guaranteed talent in that area. The second full freight salaries come into play, outsourcing looks a bit cheaper.

This is funny. I'd argue the exact opposite. I would self host only:

* if I were on a tight budget, where trading an hour or two of my time for a cost saving of a hundred dollars or so would be a good deal; or

* at a company that has reached the scale where employing engineers to manage self-hosted databases is more cost effective than outsourcing.

I have nothing against self-hosting PostgreSQL. Do whatever you prefer. But to me outsourcing this to cloud providers seems entirely reasonable for small and medium-sized businesses. According to the author's article, self hosting costs you between 30 and 120 minutes per month (after setup, and if you already know what to do). It's easy to do the math...

> employing engineers to manage self-hosted databases is more cost effective than outsourcing

Every company out there is using the cloud and yet still employs infrastructure engineers to deal with its complexity. The "cloud" reducing staff costs is and was always a lie.

PaaS platforms (Heroku, Render, Railway) can legitimately be operated by your average dev and not have to hire a dedicated person; those cost even more though.

Another limitation of both the cloud and PaaS is that they are only responsible for the infrastructure/services you use; they will not touch your application at all. Can your application automatically recover from a slow/intermittent network, a DB failover (that you can't even test because your cloud providers' failover and failure modes are a black box), and so on? Otherwise you're waking up at 3am no matter what.

  • > Every company out there is using the cloud and yet still employs infrastructure engineers

    Every company beyond a particular size, surely? For many small and medium-sized companies, hiring an infrastructure team makes just as little sense as hiring kitchen staff to make lunch.

    • For small companies things like vercel, supabase, firebase, ... wipe the floor with Amazon RDS.

      For medium-sized companies you need "devops engineers". And in all honesty, more of them than you'd need sysadmins for the same deployment.

      For large companies, they split up AWS responsibilities into entire departments of teams (for example, all clouds have made auth so damn difficult that most large companies have not one but multiple departments just dealing with authorization, before you so much as start your first app).

    • You're paying people to do the role either way, if it's not dedicated staff then it's taking time away from your application developers so they can play the role of underqualified architects, sysadmins, security engineers.

      16 replies →

    • It depends very much what the company is doing.

      At my last two places it very quickly got to the point where the technical complexity of deployments, managing environments, dealing with large piles of data, etc. meant that we needed to hire someone to deal with it all.

      They actually preferred managing VMs and self hosting in many cases (we kept the cloud web hosting for features like deploy previews, but that’s about it) to dealing with proprietary cloud tooling and APIs. Saved a ton of money, too.

      On the other hand, the place before that was simple enough to build and deploy using cloud solutions without hiring someone dedicated (up to at least some pretty substantial scale that we didn’t hit).

  • > Every company out there is using the cloud and yet still employs infrastructure engineers to deal with its complexity. The "cloud" reducing staff costs is and was always a lie.

    This doesn’t make sense as an argument. The reason the cloud is more complex is because that complexity is available. Under a certain size, a large number of cloud products simply can’t be managed in-house (and certainly not altogether).

    Also your argument is incorrect in my experience.

    At a smaller business I worked at, I was able to use these services to achieve uptime and performance that I couldn’t achieve self-hosted, because I had to spend time on the product itself. So yeah, we’d saved on infrastructure engineers.

    At larger scales, what your false dichotomy suggests also doesn’t actually happen. Where I work now, our data stores are all self-managed on top of EC2/Azure, where performance and reliability are critical. But we don’t self-host everything. For example, we use SES to send our emails and we use RDS for our app DB, because their performance profiles and uptime guarantees are more than acceptable for the price we pay. That frees up our platform engineers to spend their energy on keeping our uptime on our critical services.

    • >At a smaller business I worked at, I was able to use these services to achieve uptime and performance that I couldn’t achieve self-hosted, because I had to spend time on the product itself. So yeah, we’d saved on infrastructure engineers.

      How sure are you about that one? All of my Hetzner VMs reach an uptime of 99.9-something percent.

      I could see more than one small-business stack fitting onto a single one of those VMs.

      10 replies →

    • Yes, mix-and-match is the way to go, depending on what kind of skills are available in your team. I wouldn't touch a mail server with a 10-foot pole, but I'll happily self-manage certain daemons that I'm comfortable with.

      Just be careful not to accept more complexity just because it is available, which is what the AWS evangelists often try to sell. After all, we should always make an informed decision when adding a new dependency, whether in code or in infrastructure.

      1 reply →

  • > still employs infrastructure engineers

    > The "cloud" reducing staff costs

    Both can be true at the same time.

    Also:

    > Otherwise you're waking up at 3am no matter what.

    Do you account for frequency and variety of wakeups here?

    • > Do you account for frequency and variety of wakeups here?

      Yes. In my career I've dealt with way more failures due to unnecessary distributed systems (that could have been one big bare-metal box) rather than hardware failures.

      You can never eliminate wake-ups, but bare-metal systems have far fewer moving parts, which eliminates a whole bunch of failure scenarios; you're only left with actual hardware failure (and hardware is pretty reliable nowadays).

      1 reply →

  • In-house vs Cloud Provider is largely a wash in terms of cost. Regardless of the approach, you are going to need people to maintain stuff, and people cost money. Similarly, compute and storage cost money, so what you lose on the swings, you gain on the roundabouts.

    In my experience you typically need fewer people when using a Cloud Provider than in-house (or the same number of people can handle more instances) due to increased leverage. Whether you can maximize that leverage depends on how good your team is.

    US companies typically like to minimize headcount (either through accounting tricks or outsourcing) so usually using a Cloud Provider wins out for this reason alone. It's not how much money you spend, it's how it looks on the balance sheet ;)

  • Working in a university lab, self-hosting is the default for almost anything. While I would agree that the costs are quite low, I would sometimes be really happy to throw money at problems to make them go away. Without having had the chance (and thus being no expert), I really see the appeal of scaling up and down quickly in the cloud. We ran a Postgres database of a few hundred GB with multiple read replicas and we managed somehow, but we really hit the limits of our expertise at some point. Eventually we stopped migrating to newer database schemas because keeping availability was such a hassle. If I had the money as a company, I guess I would have paid for a hosted solution.

  • I don’t think it’s a lie, it’s just perhaps overstated. The number of staff needed to manage a cloud infrastructure is definitely lower than that required to manage the equivalent self-hosted infrastructure.

    Whether or not you need that equivalence is an orthogonal question.

    • > The number of staff needed to manage a cloud infrastructure is definitely lower than that required to manage the equivalent self-hosted infrastructure.

      There's probably a sweet spot where that is true, but because cloud providers offer more complexity (self-inflicted problems) and use PR to encourage you to use them ("best practices" and so on), in all the cloud-hosted shops I've worked at over a decade of experience I've always seen multiple full-time infra people being busy with... something?

      There was always something to do, whether to keep up with cloud provider changes/deprecations, implementing the latest "best practice", debugging distributed systems failures or self-inflicted problems and so on. I'm sure career/resume polishing incentives are at play here too - the employee wants the system to require their input otherwise their job is no longer needed.

      Maybe in a perfect world you can indeed use cloud-hosted services to reduce/eliminate dedicated staff, but in practice I've never seen anything but solo founders actually achieve that.

      1 reply →

    • Exactly. Companies with cloud infra often still have to hire infra people or even an infra team, but that team will be smaller than if they were self-hosting everything, in some cases radically smaller.

      I love self-hosting stuff and even have a bias towards it, but the cost/time tradeoff is more complex than most people think.

  • The fact that as many engineers are on payroll doesn't mean that "cloud" is not an efficiency improvement. When things are easier and cheaper, people don't do less or buy less. They do more and buy more until they fill their capacity. The end result is the same number (or more) of engineers, but they deal with a higher level of abstraction and achieve more with the same headcount.

  • I can't talk about staff costs, but as someone who's self-hosted Postgres before, using RDS or Supabase saves weeks of time on upgrades, replicas, tuning, and backups (yeah, you still need independent backups, but PITRs make life easier). Databases and file storage are probably the most useful cloud functionality for small teams.

    If you have the luxury of spending half a million per year on infrastructure engineers then you can of course do better, but this is by no means universal or cost-effective.

  • Well sure you still have 2 or 3 infra people but now you don’t need 15. Comparing to modern Hetzner is also not fair to “cloud” in the sense that click-and-get-server didn’t exist until cloud providers popped up. That was initially the whole point. If bare metal behind an API existed in 2009 the whole industry would look very different. Contingencies Rule Everything Around Me.

You are missing that most services don't have high availability needs and don't need to scale.

Most projects I have worked on in my career have never seen more than a hundred concurrent users. If something goes down on Saturday, I am going to fix it on Monday.

I have worked on internal tools where I just added a Postgres DB to the Docker setup and that was it. Five minutes of work and no issues at all. Sure, if you have something customer-facing you need to do a bit more and set up a good backup strategy, but that really isn't magic.
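For reference, the whole "just add Postgres to the Docker setup" step can be as small as one Compose service (a minimal sketch; the service name, credentials, and volume name are illustrative assumptions, not a vetted production config):

```yaml
# docker-compose.yml fragment: throwaway Postgres for an internal tool.
# Names and password are placeholders -- adjust for your own setup.
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_DB: tooldb
      POSTGRES_PASSWORD: changeme   # fine for an internal tool; use secrets elsewhere
    ports:
      - "5432:5432"
    volumes:
      - db-data:/var/lib/postgresql/data   # survives container recreation
volumes:
  db-data:
```

For anything customer-facing you'd add backups on top, but as the comment says, for an internal tool this is essentially the whole job.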

> at a company that has reached the scale where employing engineers to manage self-hosted databases is more cost effective than outsourcing.

This is the crux of one of the most common fallacies in software engineering decision-making today. I've participated in a bunch of architecture / vendor evaluations that concluded managed services are more cost effective almost purely because they underestimated (or even discarded entirely) the internal engineering cost of vendor management. Black-box debugging is one of the most time-consuming engineering pursuits, & even when it's something widely documented & well supported like RDS, it's only really tuned for the lowest common denominator - the complexities of tuning someone else's system at scale can really add up to only marginally less effort than self-hosting (if there's any difference at all).

But most importantly - even if it's significantly less effort than self-hosting, it's never effectively costed when evaluating trade-offs - that's what leads to this persistent myth about the engineering cost of self-hosting. "Managing" managed services is a non-zero cost.

Add to that the ultimate trade-off of accountability vs availability (internal engineers care less about availability when it's out of their hands - but it's still a loss to your product either way).

  • > Black-box debugging is one of the most time-consuming engineering pursuits, & even when it's something widely documented & well supported like RDS, it's only really tuned for the lowest common denominator - the complexities of tuning someone else's system at scale can really add up to only marginally less effort than self-hosting (if there's any difference at all).

    I'm really not sure what you're talking about here. I manage many RDS clusters at work. I think in total, we've spent maybe eight hours over the last three years "tuning" the system. It runs at about 100kqps during peak load. Could it be cheaper or faster? Probably, but it's a small fraction of our total infra spend and it's not keeping me up at night.

    Virtually all the effort we've ever put in here has gone into making the application query the appropriate indexes. But you'd do that no matter how you host your database.

    Hell, even the metrics that RDS gives you for free make the thing pay for itself, IMO. The thought of setting up grafana to monitor a new database makes me sweat.

    • > Could it be cheaper or faster? Probably

      Ultimately, it depends on your stack & your bottlenecks. If you can afford to run slower queries then focusing your efforts elsewhere makes sense for you. We run ~25kqps on average & mostly things are fine, but when on-call pages come in, query performance is a common culprit. The time we've spent on that hasn't been significantly different to self-hosted persistence backends I've worked with (probably less time spent, but far from orders of magnitude - certainly not worthy of a bullet point in the "pros" column when costing application architectures).

      1 reply →

It's not. I've been in a few shops that use RDS because they think their time is better spent doing other things.

Except now they are stuck trying to maintain and debug Postgres without the visibility and agency they would have if they hosted it themselves. The situation isn't at all that clear.

  • One thing unaccounted for if you've only ever used cloud-hosted DBs is just how slow they are compared to a modern server with NVMe storage.

    This leads the developers to do all kinds of workarounds and reach for more cloud services (and then integrating them and - often poorly - ensuring consistency across them) because the cloud hosted DB is not able to handle the load.

    On bare-metal, you can go a very long way with just throwing everything at Postgres and calling it a day.

    • 100% this. Directly connected NVMe is a massive win - often several orders of magnitude.

      You can take it even further in some contexts if you use SQLite.

      I think one of the craziest ideas of the cloud decade was to move storage away from compute. It's even worse with things like AWS lambda or vercel.

      Now vercel et al are charging you extra to have your data next to your compute. We're basically back to VMs at 100-1000x the cost.

    • Yeah our cloud DBs all have abysmal performance and high recurring cost even compared to metal we didn't even buy for hosting DBs.

    • This is the reason I manage SQL Server on a VM in Azure instead of their PaaS offering. The fully managed SQL has terrible performance unless you drop many thousands a month. The VM I built is closer to 700 a month.

      Running on IaaS also gives you more scalability knobs to tweak: SSD IOPS and bandwidth, multiple drives for logs/partitions, memory-optimized VMs, and there are a lot of low-level settings that aren't accessible in managed SQL. Licensing costs are also horrible with managed SQL Server, where it seems like you pay at the Enterprise level, whereas running it yourself offers lower-cost editions like Standard or Web.

  • Interesting. Is this an issue with RDS?

    I use Google Cloud SQL for PostgreSQL and it's been rock solid. No issues; troubleshooting works fine; all extensions we need already installed; can adjust settings where needed.

    • It's more of a general condition - it's not that RDS is somehow really faulty, it's just that when things do go wrong, it's not really anybody's job to introspect the system, because RDS is taking care of it for us.

      In the limit I don't think we should need DBAs, but as long as we need to manage indices by hand, think more than 10 seconds about the hot queries, manage replication, tune the vacuumer, track updates, and all the other rot - then actually installing PG on a node of your choice is really the smallest of the problems you face.

> self hosting costs you between 30 and 120 minutes per month

Can we honestly say that cloud services taking a half hour to two hours a month of someone's time on average is completely unheard of?

  • I handle our company's RDS instances, and probably spend closer to 2 hours a year than 2 hours a month over the last 8 years.

    It's definitely expensive, but it's not time-consuming.

    • Of course. But people also have high uptime servers with long-running processes they barely touch.

  • Very much depends on what you're doing in the cloud, how many services you are using, and how frequently those services and your app need updates.

Self hosting does not cost you that much at all. It's basically zero once you've got backups automated.

I also encourage people to just use managed databases. After all, it is easy to replace such people. Heck, actually, you can fire all of them and replace the demand with genAI nowadays.

Agreed. As someone in a very tiny shop, all us devs want to do as little context switching to ops as possible. Not even half a day a month. Our hosted services are in aggregate still way cheaper than hiring another person. (We do not employ an "infrastructure engineer").

The discussion isn't "what is more effective". The discussion is "who wants to be blamed in case things go south". If you push the decision to move to self-hosted and then one of the engineers fucks up the database, you have a serious problem. If same engineer fucks up cloud database, it's easier to save your own ass.

> trading an hour or two of my time

pacman -S postgresql

initdb -D /pathto/pgroot/data

grok/claude/gpt: "Write a concise Bash script for setting up an automated daily PostgreSQL database backup using pg_dump and cron on a Linux server, with error handling via logging and 7-day retention by deleting older backups."

ctrl+c / ctrl+v

Yeah that definitely took me an hour or two.
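The script the prompt describes comes out to something like this (a minimal sketch; the backup directory, database name, and cron schedule are illustrative assumptions, not a vetted production script):

```shell
#!/usr/bin/env bash
# Sketch of the prompted script: daily pg_dump backup with logging and
# 7-day retention. Paths and the database name are assumptions.
set -uo pipefail   # pipefail so a failed pg_dump isn't masked by gzip

backup_db() {
    # Dump one database, compressed, into a dated file; log the outcome.
    local db="$1" dir="$2"
    local stamp; stamp="$(date +%F)"
    mkdir -p "$dir"
    if pg_dump "$db" | gzip > "$dir/$db-$stamp.sql.gz"; then
        echo "$(date) backup ok: $db-$stamp.sql.gz" >> "$dir/backup.log"
    else
        echo "$(date) backup FAILED: $db" >> "$dir/backup.log"
        return 1
    fi
}

prune_backups() {
    # Retention: delete dumps older than 7 days.
    local db="$1" dir="$2"
    find "$dir" -name "$db-*.sql.gz" -mtime +7 -delete
}

# Example cron entry to run both nightly at 03:00 (path is illustrative):
# 0 3 * * * /usr/local/bin/pg_backup.sh
```

Note these dumps live on the same machine as the database; for real durability you still want an off-site copy (the 3-2-1 rule).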

  • So your backups are written to the same disk?

    > datacenter goes up in flames

    > 3-2-1 backups: 3 copies on 2 different types of media with at least 1 copy off-site. No off-site copy.

    Whoops!