Comment by agwa

5 years ago

From the SRE book: "For a postmortem to be truly blameless, it must focus on identifying the contributing causes of the incident without indicting any individual or team for bad or inappropriate behavior. A blamelessly written postmortem assumes that everyone involved in an incident had good intentions and did the right thing with the information they had. If a culture of finger pointing and shaming individuals or teams for doing the 'wrong' thing prevails, people will not bring issues to light for fear of punishment."

If it's really the case that engineers are lacking information about the impact that outages have on customers (which seems rather unlikely), then leadership needs to find a way to provide them with that information without reading customer emails about how the engineers "let them down", which is blameful.

Furthermore, making engineers "emotionally invested" doesn't provide concrete guidance on how to make better decisions in the future. A blameless postmortem does, but you're less likely to get good postmortems if engineers fear shaming and punishment, which reading those customer emails is a minor form of.

I work at Google and have written more than a few blameless postmortems. You don't need to quote things to me.

Is what was described above "finger pointing or shaming"? I don't work in TI, so I didn't experience this meeting, but it doesn't seem like it is. It also doesn't sound to me like this was the only outcome, where the execs just wagged their fingers at engineers and called it a day. Of course there'd be all sorts of process improvements derived from an understanding of the various system causes that led to an outage.

  • Yes, if I were made to attend a mandatory training in which my leaders read customer emails saying that the outage caused them to lose trust in the company, I would feel ashamed. That was surely the goal of that exercise. The fact that there were also process improvements doesn't make it any less wrong.

    Thankfully, other comments in this thread suggest that this is not how Google normally does things.