Comment by Aurornis
3 hours ago
I’ve written before on HN about when my employer hired several ex-FAANG people to manage all things cloud in our company.
Whenever there was an outage, they would put up a fight against anyone wanting to update the status page to show it. They had so many excuses and reasons not to.
Eventually we figured out that they were planning to use the uptime figures for requesting raises and promos as they did at their FAANG employer, so anything that reduced that uptime number was to be avoided at all costs.
Are there companies that actually use their statuspage as a source of truth for uptime numbers?
I think it's way more common for companies to have a public status page, and then internal tooling that tracks the "real" uptime number (e.g. Datadog monitors, New Relic monitoring, etc.).
(Your point still stands though.)
I don’t know, but I will say that this team that was hired into our company was so hyperfocused on any numbers they planned to use for performance reviews that it probably didn’t matter which service you chose to measure the website performance. They’d find a way to game it. If we had used the internal devops observability tools I bet they would have started pulling back logging and reducing severity levels as reported in the codebase.
It’s obviously not a problem at every company, because many companies will recognize these shenanigans and come down hard on them. However, you could tell these guys would spot any opportunity to game the numbers if they thought those numbers would come up at performance review time.
Ironically our CEO didn’t even look at those numbers. He used the site and remembered the recent outages.
[Datadog employee here] https://updog.ai tracks the uptime of multiple services by real impact across Datadog customers.