Comment by ffsm8
2 hours ago
> SREs debugging production outages to find a proximate "root" technical cause is a small fraction of the SRE function.
According to the stated goals of SRE, this is not just a small fraction of the job but something that shouldn't happen at all. To be clear, I'm fully aware that it will always be necessary, but whenever it happens, it's because the site reliability engineer (SRE) overlooked something.
Hence, if that's considered a large part of the job, then you're just not an SRE as Google defined the role:
https://sre.google/sre-book/table-of-contents/
That has very little connection to the blog post we're commenting on though, at least as far as I can tell.
At least I didn't find any focus on debugging. It put forward that the capability to produce reliable software is what will be the distinguishing factor in the future, and I think this holds up and is in line with the official definition of SRE.