Comment by crazygringo
3 months ago
The big dogs absolutely do phase config rollouts as a general rule.
There are still two weaknesses:
1) Some configs are inherently global and cannot be phased. There's only one place to set them. E.g. if you run a webapp, this would be the config for the load balancer, as opposed to the configs for each webserver.
2) Some configs have a cascading effect -- even though a config is applied to only 1% of servers, it affects the other servers they interact with, and the failure spreads across the entire network.
> Some configs are inherently global and cannot be phased
This is also why "it is always DNS". It's not that DNS itself is particularly unreliable, but rather that it is the one area where you can really screw up a whole system by running a single command, even if everything else is insanely redundant.
I don't believe there is anything that inherently requires DNS configs to be global.
You can shard your service behind multiple names:
my-service-1.example.com
my-service-2.example.com
my-service-3.example.com …
Then you can create smoke tests which hit each of those names as its phase of the DNS change rolls out, and if you start getting errors you stop the rollout.
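A minimal sketch of that loop in Python, to make the idea concrete. Everything here is hypothetical: `phased_rollout`, `apply_change` and `smoke_test` are names made up for illustration, and the two callbacks stand in for whatever your DNS provider's API and your health checks actually look like.

```python
from typing import Callable, Iterable, List


def phased_rollout(
    names: Iterable[str],
    apply_change: Callable[[str], None],
    smoke_test: Callable[[str], bool],
) -> List[str]:
    """Apply a change to one DNS name at a time, halting on the first failure.

    Returns the names that were changed AND passed their smoke test.
    Both callbacks are placeholders (assumptions), not a real DNS API.
    """
    healthy: List[str] = []
    for name in names:
        apply_change(name)
        if not smoke_test(name):
            # Stop here: every later name keeps its old config.
            break
        healthy.append(name)
    return healthy


if __name__ == "__main__":
    shards = [f"my-service-{i}.example.com" for i in (1, 2, 3)]
    changed: List[str] = []
    # Simulated smoke test that fails on the second shard.
    ok = phased_rollout(
        shards,
        changed.append,
        lambda name: not name.startswith("my-service-2"),
    )
    print("passed:", ok)
    print("touched:", changed)
```

In the simulated run, the third name is never touched because the second one fails its smoke test. The sketch doesn't roll back the failed shard; whether to do that automatically is a separate decision.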
Sure, but that doesn't really help for user-facing services where people expect to either type a domain name in their browser or click on a search result, and end up on your website every time.
And the access controls of DNS services are often (but not always) not fine-grained enough to actually prevent someone from ignoring the procedure and changing every single subdomain at once.
But users are going to example.com. Not my-service-33.example.com.
So if you've got some configuration that has a problem that only appears at the root-level domain, no amount of subdomain testing is going to catch it.