
Comment by mittermayr

4 days ago

Self-hosting is more a question of responsibility, I'd say. I am running a couple of SaaS products and self-host at much better performance and a fraction of the cost of running this on AWS. It's amazing, and it works perfectly fine.

For client projects, however, I always try to sell them on paying the AWS fees, simply because it shifts the responsibility for the hardware being "up" to someone else. It does not inherently solve the downtime problem, but it allows me to say, "we'll have to wait until they've sorted this out, Ikea and Disney are down, too."

It doesn't always work like that and isn't always a tried-and-true excuse, but it generally lets me sleep much better at night.

With limited budgets, however, it's hard to accept the cost of RDS (and that's with at least one staging environment) when comparing it to a very tight 3-node Galera cluster running on Hetzner at barely a couple of bucks a month.

Or Cloudflare, the titan at the front, down again today and intermittently over the past two days, after also being down a few weeks ago and earlier this year. I've also had SQS queues time out several times this week; they recovered shortly after, but it's not like these things never happen on managed environments. They happen quite a bit.

Over 20 years I've had lots of clients on self-hosted setups, even self-hosting SQL on the same VM as the web server, as you used to in the distant past for low-usage web apps.

I have never, ever, ever had a SQL box go down. I've had a web server go down once. I had someone who probably shouldn't have had access to a server accidentally turn one off once.

The only major outage I've had (2-3 hours) was when the box was also self-hosting an email server and a deploy of mine accidentally flooded it with failed-delivery notices.

I may have cried a little in frustration and panic but it got fixed in the end.

I actually find cloud-hosted SQL harder and more complicated in some ways, because the pricing and what you're actually getting for it are such a confusing mess. The only big complication with self-hosting is setting up backups, and that's a one-off task.

  • Disks go bad. RAID is nontrivial to set up. Hetzner had a big DC outage that led to data loss.

    Off site backups or replication would help, though not always trivial to fail over.

    • As someone who has set this up while not being a DBA or sysadmin:

      Replication and backups really aren’t that difficult to set up properly with something like Postgres. You can also expose metrics around this to set up alerting if replication lag goes beyond a threshold you set or a backup didn’t complete. You do need to periodically test your backups, but that is also good practice.

      I am not saying something like RDS doesn’t have value, but you are paying a huge premium for it. Once you reach a more steady state, owning your database totally makes sense. A cluster of $10-20 VPSes with NVMe drives can deliver really good performance and will take you a lot farther than you might expect.
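      The alerting described here can be driven by two numbers Postgres already exposes: replication lag (e.g. `now() - pg_last_xact_replay_timestamp()` on a standby, or `pg_stat_replication` on the primary) and the age of the last completed backup. A minimal sketch of the decision rule, with made-up thresholds:

```python
# Hypothetical alert rule for a small self-hosted Postgres setup.
# The threshold values are illustrative, not recommendations.

def should_alert(lag_seconds: float, last_backup_age_hours: float,
                 max_lag_seconds: float = 60.0,
                 max_backup_age_hours: float = 26.0) -> bool:
    """True if replication lag or backup staleness exceeds its threshold."""
    return (lag_seconds > max_lag_seconds
            or last_backup_age_hours > max_backup_age_hours)
```

      How you feed it (a cron job, a Prometheus rule, etc.) is up to you; the point is that the logic itself is a couple of comparisons.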

      5 replies →

    • So can the cloud, and the cloud has had more major outages in the last 3 months than I've seen on self-hosted setups in 20 years.

      Deploys these days take minutes, so what's the problem if a disk does go bad? You lose at most a day of data if you go with 'standard' overnight backups, and if it's mission-critical you will already have set up replicas, which again is pretty trivial and only slightly more complicated than doing it on cloud hosts.

      1 reply →

    • For this kind of small-scale setup, a reasonable backup strategy is all you need. The one critical part is actually verifying that your backups are done and that they work.

      Hardware doesn't fail that often. A single server will easily run for many years without any issues, if you are not unlucky. And many smaller setups can tolerate the downtime of renting a new server or VM and restoring from backup.
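      The "verify your backups are done" part can be automated. A rough sketch, assuming a hypothetical layout where dumps land in one directory as timestamped files (the real check should also periodically restore one somewhere):

```python
# Sketch of a "did last night's dump actually happen?" check.
# Assumes backups are written as files into a single directory (hypothetical layout).
import os
import time

def latest_backup_ok(backup_dir: str, max_age_hours: float = 26.0,
                     min_size_bytes: int = 1024) -> bool:
    """True if the newest file in backup_dir is recent and non-trivially sized.
    This only proves a file exists; restore-testing is still on you."""
    files = [os.path.join(backup_dir, f) for f in os.listdir(backup_dir)]
    files = [f for f in files if os.path.isfile(f)]
    if not files:
        return False
    newest = max(files, key=os.path.getmtime)
    age_hours = (time.time() - os.path.getmtime(newest)) / 3600
    return age_hours <= max_age_hours and os.path.getsize(newest) >= min_size_bytes
```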

    • Not as often as you might think. Hardware doesn’t fail like it used to.

      Hardware also monitors itself reasonably well because the hosting providers use it.

      It’s trivial to run mirrored containers on two separate Proxmox nodes, because hosting providers use the same kind of stuff.

      Offsite backups and replication? Also point-and-click and trivial with tools like Proxmox.

      RAID is actually trivial to set up if you don’t compare it to doing it manually from the command line. Again, tools like Proxmox make it point-and-click plus 5 minutes of watching a YouTube video.

      If we want to find a solution, our brain will find it. If we don’t, we’ll find reasons not to.

      2 replies →

    • One thing that will always stick in my mind is one time I worked at a national Internet service provider.

      The log disk was full or something. That's not the shameful part, though. What followed was a mass email saying everyone needed to update their connection string from bla bla bla 1 dot foo dot bar to bla bla bla 2 dot foo dot bar.

      This was inexcusable to me. I mean, this is an Internet service provider. If we can't even figure out DNS, we should shut down the whole business and go home.
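      The fix implied here is one level of indirection: clients keep a stable name in their connection strings, and only the record behind it changes during a migration. A hypothetical BIND-style zone fragment (`db`, `db1`, and `db2` are invented stand-ins for the commenter's hosts):

```
; clients always connect to "db.foo.bar"
; low TTL so a switchover propagates quickly
db      300     IN      CNAME   db2.foo.bar.    ; was: db1.foo.bar.
```

      One record edit instead of a mass email.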

    • They do, it isn't, and cloud providers also go bad.

      > Off site backups or replication would help, though not always trivial to fail over.

      You want those regardless of where you host.

    • > RAID is nontrivial to set up.

      Skill issue?

      It's not 2003; modern volume-managing filesystems (e.g. ZFS) make creating and managing RAID trivial.
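      For instance, a two-disk ZFS mirror is a single command (device names are placeholders; this requires root and destroys any existing data on the disks):

```shell
# create a mirrored pool named "tank" from two whole disks
zpool create tank mirror /dev/sda /dev/sdb

# later: check pool health and any resilvering progress
zpool status tank
```

      ZFS then handles checksumming, scrubs, and rebuilds; the mdadm-era manual ceremony is gone.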

Me: “Why are we switching from NoNameCMS to Salesforce?”

Savvy Manager: “NoNameCMS often won’t take our support calls, but if Salesforce goes down it’s in the WSJ the next day.”

  • Just wait until you end up spending $100,000 on an awful implementation from a partner who pretends to understand your business needs but delivers something that doesn’t work.

    But perhaps I’m bitter from prior Salesforce experiences.

> but it allows me to say, "we'll have to wait until they've sorted this out, Ikea and Disney are down, too."

From my experience, your client’s clients don’t care about this when everything else they use is still up.

  • Yes but the fact that it's "not their fault" keeps the person from getting fired.

    Don't underestimate the power of CYA

    • This is a major reason the cloud commands such a premium. It’s a way to make downtime someone else’s problem.

      The other factor is eliminating the “one guy who knows X” problem in IT. What happens if that person leaves, or you have to let them go? With managed infrastructure, there’s a pool of people who know how to write Terraform or click the right buttons, and those people are more interchangeable than someone’s DIY deployment. Worst case, the cloud provider will sell you premium support and help. It might be expensive, but you’re not down.

      Lastly, there’s been an exodus of talent from IT. Anyone really good can become a programmer and make more money, so finding IT people at a reasonable cost who can really troubleshoot, root-cause problems, and engineer good systems is very hard. The good ones command something closer to a programmer’s salary, which makes the gap with cloud costs much smaller. Might as well just go managed cloud.

      5 replies →

  • From my experience, this completely absolves you in an otherwise reputation-damaging situation.

You can still outsource everything up to the VM level and handle the rest on your own.

Obviously, it depends on the operational overhead of the specific technology.

> Self-hosting is more a question of responsibility, I'd say. I am running a couple of SaaS products and self-host at much better performance and a fraction of the cost of running this on AWS

It is. You need to answer the question: what are the consequences of your service being down for, let's say, 4 hours, or of a security patch not being properly applied, or of not following security best practices? Many people are technically unable, or lack the time or resources, to confidently address that question; hence paying someone else to do it.

Your time is money though. You are saving money but giving up time.

Like everything, it is always cheaper to do it (it being cooking at home, cleaning your home, fixing your own car, etc) yourself (if you don't include the cost of your own time doing the service you normally pay someone else for).

  • You can pay someone else to manage your hardware stack; there are literal companies that will just keep it running while you deploy your apps on it.

    > It is. You need to answer the question: what are the consequences of your service being down for, let's say, 4 hours, or of a security patch not being properly applied, or of not following security best practices?

    There is one advantage a self-hosted setup has here: if you set up a VPN, only your employees have access, and the server doesn't have to be reachable from the internet. So even in the case of a zero-day that WILL make a SaaS company leak your data, you can be safe(r) with a self-hosted solution.

    > Your time is money though. You are saving money but giving up time.

    The investment compounds. Setting up infra to run a single container for some app takes time, and there's a good chance it won't pay for itself.

    But the 2nd service? Cheaper. The 5th? At that point you've probably automated it enough that it's just a matter of pointing it at a docker container and tweaking a few settings.

    > Like everything, it is always cheaper to do it (it being cooking at home, cleaning your home, fixing your own car, etc) yourself (if you don't include the cost of your own time doing the service you normally pay someone else for).

    It's cheaper even when you include your own time. You pay a technical person at your company to do it. A SaaS company pays that same kind of technical person, then pays sales and PR people to sell the product, then pays income tax on the revenue, and then also needs to "pay" its investors.

    Yeah, building a service for 4 people in a company can be more work than just paying a SaaS company $10/mo. But for 20? 50? 100? It quickly gets to the point where self-hosting (whether actually "self", or on dedicated servers, or in the cloud) actually pays off.

  • > Like everything, it is always cheaper to do it (it being cooking at home, cleaning your home, fixing your own car, etc) yourself (if you don't include the cost of your own time doing the service you normally pay someone else for).

    In a business context the "time is money" thing actually makes sense, because there's a reasonable likelihood that the business can put the time to a more profitable use in some other way. But in a personal context it makes no sense at all. Realistically, the time I spend cooking or cleaning was not going to earn me a dime no matter what else I did, therefore the opportunity cost is zero. And this is true for almost everyone out there.

  • Yeah, I agree... by that narrative we'd better outsource product development, management, and everything else too.

    • Unironically, I agree. You should be outsourcing things that aren't your core competency. I think many people on this forum take a certain pride in doing this manually, but to me it wouldn't make sense in any other context.

      Could you imagine accountants arguing that you shouldn't use a service like Paychex or Gusto and should just run payroll manually? After all, it's cheaper! Just spend a week tracking taxes and benefits and signing checks.

      Self-hosting, to me, doesn't make sense unless you are 1) doing something not offered by the cloud, or have a pathological use case, 2) running a hobby project, or 3) in maintenance mode on the product. Otherwise your time is better spent on your core product, and if it isn't, you probably aren't busy enough. If your RDS cluster is so expensive relative to your traffic, you probably aren't charging enough, or your business economics really don't make sense.

      I've managed large database clusters (MySQL, Cassandra) on bare-metal hardware in managed colo in the past. I'm well aware of the performance that's being left on the table and what the cost difference is. For the vast majority of businesses, optimizing for self-hosting doesn't make sense, especially if you don't have PMF. For a company like 37signals, sure: product velocity is probably very high, and there are engineering cycles to spare. But if you aren't profitable, self-hosting won't make you profitable, and your time is better spent elsewhere.

      9 replies →

    • That’s pretty reductive. By that logic the opposite extreme is just as true: if using managed services is as bad as outsourcing everything else, then a business shouldn’t rent real estate either; every business should build and own its own facility. They should also never contract out janitorial work, nor retain outside law firms; they should hire and staff those departments internally, every time, no nuance allowed.

      You see the issue?

      Like, I’m all for not procuring things that it makes more sense to own or build (and I know most businesses have piss-poor instincts about which is which: hell, I work for the government! I can see firsthand the consequences of outsourcing decision-making to contractors, rather than just outsourcing implementation).

      But it’s very case-by-case. There’s no general rule like “always prefer self-hosting” or “always rent real estate, never buy” that applies broadly enough to be useful.

      7 replies →

That argument does not hold when AWS serverless Postgres is available, which costs almost nothing for low traffic and is vastly superior to self-hosting in terms of observability, security, integration, backups, etc.

There is no reason to self-manage Postgres for a dev environment.

https://aws.amazon.com/rds/aurora/serverless/

  • "which costs almost nothing for low traffic": you invented the retort "what about high traffic" within your own message. I don't even necessarily mean user traffic, either. If you constantly have to sync new records over (as could be the case in any kind of timeseries use case), the internal traffic can rack up costs quickly.

    "vastly superior to self-hosting regarding observability": I'd suggest looking into the cnpg operator for Postgres on Kubernetes. The built-in metrics and official dashboard are vastly superior to what I get from CloudWatch for my RDS clusters. And the backup mechanism, using Barman for database snapshots and WAL backups, is vastly superior to AWS DMS or AWS's disk snapshots, which aren't portable to a system outside AWS if you care about avoiding vendor lock-in.
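    For a sense of what that looks like, the shape of a minimal CloudNativePG cluster with Barman object-store backups is roughly this (values are placeholders, credentials omitted; check the cnpg docs for the exact schema of your version):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-main
spec:
  instances: 3            # one primary, two streaming replicas
  storage:
    size: 20Gi
  backup:
    barmanObjectStore:
      destinationPath: s3://my-backup-bucket/pg-main   # placeholder bucket
```

    Replication, failover, metrics, and WAL archiving all fall out of that one resource.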

  • This was true for RDS Serverless v1, which scaled to 0 but is no longer offered. V2 requires a minimum 0.5 ACU hourly commitment ($40+/mo).

  • Aurora Serverless requires provisioned compute; it was about $40/mo last time I checked.

    • The performance disparity is just insane.

      Right now from Hetzner you can get a dedicated server with a 6c/12t Ryzen 5 3600, 64GB RAM, and 2x512GB NVMe SSDs for €37/mo.

      Even if you just served files from disk, with no RAM caching, that could give 200k small files per second.

      From RAM, and with 6 dedicated cores, the network will saturate long before you hit compute limits on any reasonably efficient web framework.

  • Just use a pg container on a VM; cheap as chips, and you can do anything to 'em.
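    In that spirit, the whole setup really can be one command (the password and volume name are placeholders; pin whatever image tag you actually want and this needs a running Docker daemon):

```shell
# disposable-but-persistent Postgres in a container:
# data survives restarts via the named volume "pgdata"
docker run -d --name pg \
  -e POSTGRES_PASSWORD=change-me \
  -v pgdata:/var/lib/postgresql/data \
  -p 5432:5432 \
  postgres:16
```

    From there, `docker exec -it pg psql -U postgres` gets you a shell, and backups are a `pg_dump` away.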