Comment by twawaaay

3 years ago

Yeah, people frequently overestimate how reliable their app needs to be.

My CEO is constantly beating it into my head: "I will gladly accept a day or even more of downtime from time to time if it lets me get what I want 10% faster".

It does not mean shoddy engineering. It just means consciously deciding not to do some work. Spending a couple more days to get the app to fail over automatically to another instance without downtime? Engineering it to run as a distributed system when it could happily run on a single server? Let's just not do it. Let's make it simpler and easier to build, and accept that if AWS has a failure it may take a moment to spin up another instance somewhere else.