
Comment by pinkgolem

4 days ago

>At a smaller business I worked at, I was able to use these services to achieve uptime and performance that I couldn’t achieve self-hosted, because I had to spend time on the product itself. So yeah, we’d saved on infrastructure engineers.

How sure are you about that one? All of my Hetzner VMs reach an uptime of 99.9-something percent.

I could see more than one small-business stack fitting onto a single one of those VMs.

100% certain, because I started by self-hosting before moving specific components to AWS services, which improved the uptime and reduced the time I spent keeping those services alive.

  • What was the work you spent configuring those services and keeping them alive? I am genuinely curious...

    We have a very limited set of services, but most have been very painless to maintain.

    • A Django+Celery app behind Nginx back in the day. Most maintenance would be discovering a new failure mode:

      - certificates not being renewed in time

      - Celery eating up all RAM and having to be recycled

      - RabbitMQ getting blocked requiring a forced restart

      - random issues with Postgres that usually required a hard restart of PG (running low on RAM maybe?)

      - configs having issues

      - running out of inodes

      - DNS not updating when upgrading to a new server (no CDN at the time)

      - data centre going down, taking the provider’s email support with it (yes, really)

      Bear in mind I’m going back a decade now, so my memory is rusty. Each issue was solvable, but each would happen at random, and even mitigating them took time that I (a single dev) was not spending on new features or fixing bugs.
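Two of the failure modes listed above (certificates not renewed in time, running out of inodes) are the kind that a small scheduled check can catch before they cause an outage. What follows is only a minimal sketch, not what the commenter actually ran: the function names, the default port, and the idea of running it from cron are all assumptions made for illustration.

```python
import os
import socket
import ssl
import time


def days_left(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def cert_days_remaining(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Fetch the TLS certificate served by `host` and return the days until it expires.

    An already-expired certificate fails the handshake and raises
    ssl.SSLCertVerificationError, which is itself worth alerting on.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return days_left(cert["notAfter"], time.time())


def inode_usage(path: str = "/") -> float:
    """Fraction of inodes in use on the filesystem containing `path` (Unix only)."""
    st = os.statvfs(path)
    if st.f_files == 0:
        # Some filesystems do not report a fixed inode count.
        return 0.0
    return 1.0 - st.f_ffree / st.f_files
```

A cron job could call `cert_days_remaining("example.com")` and `inode_usage("/")` and send an alert when the first drops below, say, 14 days or the second rises above 0.9. The hostname and both thresholds are placeholders.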


Just because your VM is running doesn't mean the service is accessible. Whenever there's a large AWS outage it's usually not because the servers turned off. It also doesn't guarantee that your backups are working properly.
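The gap between "the VM is up" and "the service is reachable" is usually closed by probing the service itself from outside the machine. Here is a rough sketch of such a probe, assuming the service exposes some HTTP URL that is safe to request; the function name and the `/healthz` path in the usage note are hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def service_responds(url: str, timeout: float = 5.0) -> bool:
    """Return True only if an HTTP GET to `url` completes with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers 4xx/5xx responses, DNS failures, refused connections,
        # timeouts, malformed responses, and malformed URLs.
        return False
```

Run from a different host or network, something like `service_responds("https://example.com/healthz")` says more about what users see than the hypervisor's uptime counter does.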

  • If you have a server where everything is on the server, the server being on means everything is online... There is not a lot of complexity going on inside a single-server infrastructure.

    I mean just because you have backups does not mean you can restore them ;-)

    We test backup restoration automatically and also manually on a quarterly basis, but you should do the same with AWS.

    Otherwise, how do you know you can restore system A without impacting other dependencies, D and C?
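An automated restore test can be as simple as restoring into a scratch location and comparing the result with a known-good copy. This sketch assumes the backup is a tar archive whose members are stored relative to the source directory, which is an assumption and not something stated in the thread; the function names are made up for illustration.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path relative to `root` to its SHA-256 hex digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def restore_matches_source(archive: Path, source: Path) -> bool:
    """Restore `archive` into a scratch directory and compare it with `source`."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            # Use the safer "data" extraction filter where this Python supports it.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(scratch, filter="data")
            else:
                tar.extractall(scratch)
        return tree_digests(Path(scratch)) == tree_digests(source)
```

For data that keeps changing, the comparison target would have to be a snapshot taken at backup time, not the live directory. A database backup needs a different check altogether, such as loading the dump into a throwaway instance and running a few queries against it.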