Comment by hnuser123456
2 days ago
"there is a 5% chance your instance is down" is still a partial outage. A green check should only mean everything (about that service) is working for everyone (in that region) as intended.
Downdetector reports started spiking over an hour ago but there still isn't a single status that isn't a green checkmark on the status page.
With highly distributed services there's always something failing, some small percentage.
Sure, but you can still put a message up when some <numeric value> goes over some <threshold value>, e.g. errors are 50% higher than normal (maybe the SLO is that 99.999% of requests are processed successfully).
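A minimal sketch of that kind of rule, in Python. Everything here is an assumption made for illustration: the function name, the "ok"/"degraded" labels, and the choice to flag when either the error rate is 50% above baseline or the SLO error budget is exceeded. It is not how any real status page decides.

```python
def status_for(errors: int, requests: int, baseline_error_rate: float,
               slo_success: float = 0.99999,
               ratio_threshold: float = 1.5) -> str:
    """Return "degraded" if the error rate is well above normal or above the SLO budget.

    baseline_error_rate: the "normal" error rate (hypothetical input).
    slo_success: target fraction of successful requests (99.999% by default).
    ratio_threshold: 1.5 means "errors are 50% higher than normal".
    """
    if requests == 0:
        # No traffic, nothing to judge; treated as healthy in this sketch.
        return "ok"
    error_rate = errors / requests
    slo_error_budget = 1.0 - slo_success
    over_baseline = error_rate > baseline_error_rate * ratio_threshold
    over_slo = error_rate > slo_error_budget
    return "degraded" if (over_baseline or over_slo) else "ok"


if __name__ == "__main__":
    # 8 errors per million against a baseline of 5 per million is 60% above normal.
    print(status_for(errors=8, requests=1_000_000, baseline_error_rate=5e-6))
```

The point is only that the check is cheap once you have a baseline; picking the baseline window and the threshold is the hard part.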
Just note that with aggregations like that, the result might well be "GCP didn't have any issues today", actually.
E.g. it was mostly the us-central1 region that was affected, and within it only some services (e.g. regular instances and GKE were not affected in any region). So if we ask "what percentage of GCP is down", it may well be less than the threshold.
On the other hand, about a month ago, on 2025-05-19, there was an 8-hour incident with Spot VM instances affecting 5 regions, which was way more important to our company, but it didn't make any headlines.
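To make the dilution point concrete, here is a rough Python sketch comparing one global error rate against per-(region, service) rates. The function names, service names, and request counts are all made up for illustration and are not real GCP data.

```python
from collections import defaultdict


def error_rates(samples):
    """samples: iterable of (region, service, requests, errors) tuples.

    Returns (overall_rate, {(region, service): rate}).
    """
    totals = defaultdict(lambda: [0, 0])  # (region, service) -> [requests, errors]
    all_requests = 0
    all_errors = 0
    for region, service, requests, errors in samples:
        totals[(region, service)][0] += requests
        totals[(region, service)][1] += errors
        all_requests += requests
        all_errors += errors
    per_slice = {key: e / r for key, (r, e) in totals.items() if r}
    overall = all_errors / all_requests if all_requests else 0.0
    return overall, per_slice


def breaching(samples, threshold):
    """Return (global_breach, sorted list of slices above the threshold)."""
    overall, per_slice = error_rates(samples)
    slices = sorted(key for key, rate in per_slice.items() if rate > threshold)
    return overall > threshold, slices


if __name__ == "__main__":
    # Hypothetical traffic: one service in one region is badly broken.
    samples = [
        ("us-central1", "service-a", 1_000, 400),
        ("us-central1", "service-b", 9_000, 0),
        ("europe-west1", "service-a", 90_000, 0),
    ]
    print(breaching(samples, threshold=0.01))
```

With these made-up numbers the global rate stays under a 1% threshold while one slice is failing 40% of its requests, which is why a per-region, per-service check (and a per-region, per-service status) tells a different story than the aggregate.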
Just say it: they want to lie to 95% of customers.