Comment by nagyf

3 years ago

That is not what I would call a "moment". If you have to rewrite your application code to migrate to a different platform, that's not going to be a moment.

Even if you abstract away most of the platform-specific stuff in your code, that's going to take days or weeks of implementation and testing before you can go live. That won't help you when a provider suddenly bans your account in the middle of the night and you need the app running ASAP.

> If you have to rewrite your application code to migrate to a different platform, that's not going to be a moment.

Having a couple of days of downtime might be an acceptable tradeoff for an event with a very low chance of happening. It's basically risk management. (If your business doesn’t survive that downtime, it might be a completely different story, of course.)

  • Yeah, people frequently overestimate how reliable their app needs to be.

    My CEO is constantly beating it into my head: "I will gladly accept a day or even more of downtime from time to time if it lets me get what I want 10% faster".

    It does not mean shoddy engineering. It just means consciously deciding not to do some work. Spending a couple more days to get the app to fail over automatically to another instance without downtime? Engineering it to run as a distributed system when it could happily run on a single server? Let's just not do it. Let's make it simpler and easier to build, and accept that if AWS has a failure it may take a moment to spin up another instance somewhere else.