Comment by nijave
3 days ago
Sure, but you can still put a message up when some <numeric value> is over some <threshold value>, like errors being 50% higher than normal (maybe the SLO is that 99.999% of requests are processed successfully).
Just note that an aggregation like that might end up reporting that GCP didn't have any issues today at all.
E.g. it was mostly the us-central1 region that was affected, and even there only some services were hit (regular instances and GKE, for example, were not affected in any region). So if we ask "what percentage of GCP is down?", the answer may well be below the threshold.
On the other hand, about a month ago, on 2025-05-19, there was an 8-hour incident with Spot VM instances affecting 5 regions. It was far more important to our company, but it didn't make any headlines.
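To make the dilution point concrete, here is a rough sketch with entirely made-up numbers. The region/service labels, request counts, and the 1% threshold are all assumptions for illustration, not data from any real incident. It compares one global error rate against per-(region, service) error rates:

```python
# Hypothetical traffic: (region, service) -> (requests, errors).
# Every number here is invented for illustration only.
TRAFFIC = {
    ("us-central1", "affected-service"): (100_000, 40_000),
    ("us-central1", "compute"): (900_000, 90),
    ("us-east1", "affected-service"): (1_000_000, 100),
    ("us-east1", "compute"): (4_000_000, 400),
    ("europe-west1", "compute"): (4_000_000, 400),
}

THRESHOLD = 0.01  # assumed alerting threshold: 1% of requests failing


def global_error_rate(traffic):
    """Error rate over all slices combined."""
    total_requests = sum(requests for requests, _ in traffic.values())
    total_errors = sum(errors for _, errors in traffic.values())
    return total_errors / total_requests


def breached_slices(traffic, threshold):
    """Slices whose own error rate exceeds the threshold."""
    return [
        key
        for key, (requests, errors) in traffic.items()
        if errors / requests > threshold
    ]


if __name__ == "__main__":
    rate = global_error_rate(TRAFFIC)
    print(f"global error rate: {rate:.4%}, alert: {rate > THRESHOLD}")
    for region, service in breached_slices(TRAFFIC, THRESHOLD):
        print(f"breached: {region}/{service}")
```

With these numbers, one slice is failing 40% of its requests while the global rate stays under half a percent, so a single aggregate threshold would stay silent. Which slices matter still depends on what a given customer actually runs.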