Comment by reillyse

3 years ago

Nope. Multiple small servers.

1) you need to get over the hump and build multiple servers into your architecture from the get-go (the author says you need two servers minimum), so really we are talking about two big servers.

2) having multiple small servers allows us to spread our service into different availability zones

3) multiple small servers allows us to do rolling deploys without bringing down our entire service

4) once we use the multiple small servers approach, it's easy to scale up and down our compute by adding or removing machines. With one server, it's difficult to scale up or down without buying more machines. Small servers we can add incrementally, but with the large-server approach, scaling up requires downtime and buying a new server.

The line of thinking you follow is what is plaguing this industry with too much complexity and simultaneously throwing away incredible CPU and PCIe performance gains in favor of using the network.

Any technical decision about how many instances to have and how they should be spread out needs to start as a business decision and end in crisp numbers for recovery point/time objectives, and yet somehow that nearly never happens.

To answer your points:

1) Not necessarily. You can stream data backups to remote storage and recover from that on a new single server as long as that recovery fits your Recovery Time Objective (RTO).

2) What's the benefit of multiple AZs if the SLA of a single AZ is greater than your intended availability goals? (Have you checked your provider's single AZ SLA?)

3) You can absolutely do rolling deploys on a single server (a rough sketch follows at the end of this comment).

4) Using one large server doesn't mean you can't complement it with smaller servers on an as-needed basis. AWS even has a service for doing this.

Which is to say: there aren't any prescriptions when it comes to such decisions. Some businesses warrant your choices; the vast majority do not.
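
As a minimal sketch of point 3: run the new release alongside the old one on a second port, health-check it, flip the local reverse proxy, then stop the old process. The systemd unit name, ports, health endpoint, and nginx paths below are illustrative assumptions, not a recipe.

```python
# Rough sketch of a blue/green deploy on a single box (assumed setup:
# templated systemd units "myapp@<port>" behind a local nginx upstream).
import subprocess
import time
import urllib.request

OLD_PORT, NEW_PORT = 8080, 8081

def healthy(port: int, tries: int = 30) -> bool:
    """Poll a hypothetical /healthz endpoint until it answers 200."""
    for _ in range(tries):
        try:
            with urllib.request.urlopen(f"http://127.0.0.1:{port}/healthz", timeout=2) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass
        time.sleep(1)
    return False

# Start the new release next to the old one.
subprocess.run(["systemctl", "start", f"myapp@{NEW_PORT}"], check=True)
if not healthy(NEW_PORT):
    raise SystemExit("new release never became healthy; old one keeps serving")

# Point nginx at the new port and reload (reloads are graceful, so no dropped connections).
subprocess.run(["ln", "-sf", f"/etc/nginx/upstreams/{NEW_PORT}.conf",
                "/etc/nginx/upstreams/active.conf"], check=True)
subprocess.run(["nginx", "-s", "reload"], check=True)

# Old release is now idle; stop it.
subprocess.run(["systemctl", "stop", f"myapp@{OLD_PORT}"], check=True)
```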

  • > Any technical decision about how many instances to have and how they should be spread out needs to start as a business decision and end in crisp numbers for recovery point/time objectives, and yet somehow that nearly never happens.

    Nobody wants to admit that their business or their department actually has an SLA of "as soon as you can, maybe tomorrow, as long as it usually works". So everything is pretend-engineered to be fifteen nines of reliability (when in reality it sometimes explodes because of the "attempts" to make it robust).

    Being honest about the actual requirements can be extremely helpful.

    • > Nobody wants to admit that their business or their department actually has an SLA of "as soon as you can, maybe tomorrow, as long as it usually works". So everything is pretend-engineered to be fifteen nines of reliability (when in reality it sometimes explodes because of the "attempts" to make it robust).

      I have yet to see my principal technical frustrations summarized so concisely. This is at the heart of everything.

      If the business and the engineers could get over their ridiculous obsession with statistical outcomes and strict determinism, they would be able to arrive at a much more cost-effective, simple, and human-friendly solution.

      The businesses that are actually sensitive to >1 minute of annual downtime are already running on IBM mainframes and have been for decades. No one's business is as important as the Federal Reserve or the Pentagon, but they don't want to admit it to themselves or others.

  • > simultaneously throwing away incredible CPU and PCIe performance gains

    We really need to double down on this point. I worry that some developers believe they can defeat the laws of physics with clever protocols.

    The amount of time it takes to round trip the network in the same datacenter is roughly 100,000 to 1,000,000 nanoseconds.

    The amount of time it takes to round trip L1 cache is around half a nanosecond.

    A trip down PCIe isn't much worse, relatively speaking. Maybe hundreds of nanoseconds.

    Lots of assumptions and hand waving here, but L1 cache can be around 1,000,000x faster than going across the network. SIX orders of magnitude of performance are instantly sacrificed to the gods of basic physics the moment you decide to spread that SQLite instance across US-EAST-1. Sure, it might not wind up a million times slower on a relative basis, but you'll never get access to those zeroes again.
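
    To put rough numbers on the comparison above (the figures below are the same order-of-magnitude assumptions, not measurements):

    ```python
    # Back-of-the-envelope ratios for the latencies discussed above.
    # All figures are rough, order-of-magnitude assumptions.
    L1_HIT_NS = 0.5              # ~half a nanosecond for an L1 cache hit
    PCIE_ROUND_TRIP_NS = 500     # "hundreds of nanoseconds" across PCIe
    DATACENTER_RTT_NS = 500_000  # ~100,000 to 1,000,000 ns within one datacenter

    print(f"PCIe vs L1:    {PCIE_ROUND_TRIP_NS / L1_HIT_NS:,.0f}x slower")  # ~1,000x
    print(f"Network vs L1: {DATACENTER_RTT_NS / L1_HIT_NS:,.0f}x slower")   # ~1,000,000x
    ```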

  • > 2) What's the benefit of multiple AZs if the SLA of a single AZ is greater than your intended availability goals? (Have you checked your provider's single AZ SLA?)

    … my provider's single-AZ SLA is less than my company's intended availability goals.

    (IMO our goals are nuts too, but it is what it is.)

    Our provider, in the worst case (a VM using a managed hard disk), has an SLA of 95% within a month (I… think. Their SLA page uses incorrect units on the top-line items. The examples in the legalese (examples are normative, right?) use a unit of %/mo…).

    You're also assuming a provider (a) typically meets their SLAs, (b) honors them if they don't, and (c) doesn't fail in a way that's impactful but not covered by the SLA. IME, (a) is highly service-dependent, with some services being just stellar at it, and (b) is usually "they will if you can prove to them with your own metrics that they had an outage, and push for a credit." (For (c): I had a cloud provider once whose SLA was over "the APIs should return 2xx", and the APIs, during the outage, always returned "2xx, I'm processing your request". You then polled the API and got "2xx, your request is pending". Nothing was happening, because they were having an outage, but that outage could continue indefinitely without impacting the SLA! That was a fun support call…)

    There's also (d): AZ independence is a myth; I've seen multiple global outages. E.g., when something like the global authentication service falls over, it takes basically every other service with it, because nothing can authenticate. What's even better is the provider then listing those services as "up" / not in an outage, because technically it's not that service that's down, it's just the authentication service. Because God forbid you'd have to give out that credit. But a provider calling a service "up" that is failing 100% of the requests sent its way is just rich, from the customer's view.

  • I agree! Our "distributed cloud database" just went down last night for a couple of HOURS. Well, not entirely down. But there were connection issues for hours.

    Guess what never, never had this issue? The hardware I keep in a datacenter lol!

  • > The line of thinking you follow is what is plaguing this industry with too much complexity and simultaneously throwing away incredible CPU and PCIe performance gains in favor of using the network.

    It will die out naturally once people realize how much the times have changed and that the old solutions based on weaker hardware are no longer optimal.

  • Ok, so to your points.

    "It depends" is the correct answer to the question, but the least informative.

    One Big Server or multiple small servers? It depends.

    It always depends. There are many workloads where one big server is the perfect size. There are many workloads where many small servers are the perfect solution.

    My point is that the ideas put forward in the article are flawed for the vast majority of use cases.

    I'm saying that multiple small servers are a better solution along a number of different axes.

    For 1), "One Server (Plus a Backup) is Usually Plenty": now I need some kind of remote-storage streaming system and some kind of manual recovery. Am I going to fail over to the backup (in which case it needs to be as big as my "one server"), or will I need to manually recover from my backup?

    2) Yes, it depends on your availability goals, but you get this as a side effect of having more than one small instance.

    3) Maybe I was ambiguous here. I don't just mean rolling deploys of code; I also mean changing the server code, restarting, upgrading, and changing out the server. What happens when you migrate to a new server (when you scale up by purchasing a different box)? Now we have a manual process that doesn't get executed very often and is bound to cause downtime.

    4) Now we have "Use one Big Server - and a bunch of small ones"

    I'm going to add a final point on reliability. By far the biggest risk factor for reliability is me, the engineer. I'm responsible for bringing down my own infra way more than any software bug or hardware issue. The probability of me messing everything up when there is one server that everything depends on is much, much higher, speaking from experience.

    So, like I said, I could have said "it depends", but instead I tried to give a response that was in some way illuminating and helpful, especially given the strong opinions expressed in the article.

    I'll give a little color with the current setup for a site I run.

    moustachecoffeeclub.com runs on ECS

    I have 2 on-demand instances and 3 spot instances:

    One tiny instance running my caches (Redis, Memcached)
    One "permanent" small instance running my web server
    Two small spot instances running the web server
    One small spot instance running background jobs

    ("Small" being about 3 GB of memory and 1024 CPU units.)

    And an RDS instance with backups at about $67/month.

    All in I'm well under $200 per month including database.

    So you can do multiple small servers inexpensively.

    Another aspect is that I appreciate being able to go on vacation for a couple of weeks, go camping or take a plane flight without worrying if my one server is going to fall over when I'm away and my site is going to be down for a week. In a big company maybe there is someone paid to monitor this, but with a small company I could come back to a smoking hulk of a company and that wouldn't be fun.

    • > All in I'm well under $200 per month including database.

      You forgot all the crucial numbers, like QPS. My blog runs on 0 to 1 Cloud Run instances and costs < $3 per month, including the database.

> you need to get over the hump and build multiple servers into your architecture from the get-go (the author says you need two servers minimum), so really we are talking about two big servers.

Managing a handful of big servers can be done manually if needed - it's not pretty, but it works, and people were doing it just fine before the cloud came along. If you intentionally plan on having dozens or hundreds of small servers, manual management becomes unsustainable, and now you need a control plane such as Kubernetes, with all the complexity and failure modes it brings.

> having multiple small servers allows us to spread our service into different availability zones

So will 2 big servers in different AZs (whether cloud AZs or old-school hosting providers such as OVH).

> multiple small servers allows us to do rolling deploys without bringing down our entire service

Nothing prevents you from starting multiple instances of your app on one big server, or from doing rolling deploys with big bare-metal servers, assuming one server can handle the peak load (you take your first server out of the LB, upgrade it, put it back in the LB, then do the same for the second, and so on).
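
A sketch of that loop, with the LB and provisioning calls left as stubs since they depend entirely on your load balancer and tooling (the helper names below are placeholders, not a real API):

```python
# Rolling deploy across a small fleet: drain each server from the LB,
# upgrade it, verify health, then re-add it. The lb/deploy/health helpers
# are placeholders for whatever your load balancer and tooling expose.
import time

SERVERS = ["web1.example.internal", "web2.example.internal"]

def lb_set_enabled(server: str, enabled: bool) -> None:
    raise NotImplementedError("call your load balancer's API here")

def deploy_release(server: str, version: str) -> None:
    raise NotImplementedError("ssh / Ansible / image swap goes here")

def is_healthy(server: str) -> bool:
    raise NotImplementedError("hit the app's health endpoint here")

def rolling_deploy(version: str) -> None:
    for server in SERVERS:
        lb_set_enabled(server, False)   # stop sending new traffic
        time.sleep(30)                  # let in-flight requests drain
        deploy_release(server, version)
        if not is_healthy(server):
            raise RuntimeError(f"{server} unhealthy after deploy; aborting roll-out")
        lb_set_enabled(server, True)    # back into rotation before touching the next one
```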

> once we use the multiple small servers approach, it's easy to scale up and down our compute by adding or removing machines. With one server, it's difficult to scale up or down without buying more machines. Small servers we can add incrementally, but with the large-server approach, scaling up requires downtime and buying a new server.

True, but the cost premium of the cloud often offsets the savings of autoscaling. A bare-metal server capable of handling peak load is often cheaper than your autoscaling stack at low load, so you can just overprovision to always meet peak load and still come out ahead.
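
The comparison is easy to sanity-check with a toy break-even calculation; every number below is a made-up placeholder, so substitute your own quotes:

```python
# Toy monthly-cost comparison: overprovisioned bare metal vs. autoscaled cloud.
# All prices and capacity figures are hypothetical placeholders.

def bare_metal_monthly(peak_units: float, unit_price: float) -> float:
    # You pay for peak capacity 24/7, used or not.
    return peak_units * unit_price

def autoscaled_monthly(avg_units: float, cloud_unit_price: float, fixed_overhead: float) -> float:
    # You pay for average usage, plus the LB / NAT / control-plane baseline.
    return avg_units * cloud_unit_price + fixed_overhead

# Placeholder numbers purely for illustration:
print(bare_metal_monthly(peak_units=10, unit_price=15))       # 150
print(autoscaled_monthly(avg_units=3, cloud_unit_price=45,
                         fixed_overhead=40))                  # 175
```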

  • I manage hundreds of servers, and use Ansible. It's simple and it gets the job done. I tried to install Kubernetes on a cluster and couldn't get it to work. I mean I know it works, obviously, but I could not figure it out and decided to stay with what works for me.

    • But it's specific to you, and no one will want to take over your job.

      The upside of a standard AWS CloudFormation file is that engineers are replaceable. They're cargo-cult engineers, but they're not worried about their careers.

      1 reply →

On a big server, you would probably be running VMs rather than serving directly. And then it becomes easy to do most of what you're talking about - the big server is just a pool of resources from which to make small, single-purpose VMs as you need them.

It completely depends on what you're doing. This was pointed out in the first paragraph of the article:

> By thinking about the real operational considerations of our systems, we can get some insight into whether we actually need distributed systems for most things.