
Comment by UncleMeat

5 years ago

"Blameless Postmortem" does not mean "No Consequences", even if people often want to interpret it that way. If an organization determines that a disconnect between ground work and a customer's experience is a contributing factor to poor decision making then they might conclude that making engineers more emotionally invested in their customers could be a viable path forward.

Relentless customer service is never going to screw you over in my experience... It pains me that we have to constantly play these games of abstraction between engineer and customer. You are presumably working a job which involves some business and some customer. It is not a fucking daycare. If any of my customers are pissed about their experience, I want to be on the phone with them as soon as humanly possible and I want to hear it myself. Yes, it is a dreadful experience to get bitched at, but it also sharpens your focus like you wouldn't believe when you can't just throw a problem to the guy behind you.

By all means, put the support/enhancement requests through a separate channel+buffer so everyone can actually get work done during the day. But, at no point should an engineer ever be allowed to feel like they don't have to answer to some customer. If you are terrified a junior dev is going to say a naughty phrase to a VIP, then invent an internal customer for them to answer to, and diligently proxy the end customer's sentiment for the engineer's benefit.

  • I think of this in terms of empathy: every engineer should be able to provide a quick and accurate answer to "What do our customers want? And how do they use our product?"

    I'm not talking esoterica, but at least a first approximation.

From the SRE book: "For a postmortem to be truly blameless, it must focus on identifying the contributing causes of the incident without indicting any individual or team for bad or inappropriate behavior. A blamelessly written postmortem assumes that everyone involved in an incident had good intentions and did the right thing with the information they had. If a culture of finger pointing and shaming individuals or teams for doing the 'wrong' thing prevails, people will not bring issues to light for fear of punishment."

If it's really the case that engineers are lacking information about the impact that outages have on customers (which seems rather unlikely), then leadership needs to find a way to provide them with that information without reading customer emails about how the engineers "let them down", which is blameful.

Furthermore, making engineers "emotionally invested" doesn't provide concrete guidance on how to make better decisions in the future. A blameless postmortem does, but you're less likely to get good postmortems if engineers fear shaming and punishment, and reading those customer emails is a minor form of both.

  • I work at Google and have written more than a few blameless postmortems. You don't need to quote things to me.

    Is what was described above "finger pointing or shaming"? I don't work in TI so I didn't experience this meeting but it doesn't seem like it is. It also doesn't sound to me like this was the only outcome, where the execs just wagged their fingers at engineers and called it a day. Of course there'd be all sorts of process improvements derived from an understanding of the various system causes that led to an outage.

    • Yes, if I were made to attend a mandatory training in which my leaders read customer emails saying that the outage caused them to lose trust in the company, I would feel ashamed. That was surely the goal of that exercise. The fact that there were also process improvements doesn't make it any less wrong.

      Thankfully, other comments in this thread suggest that this is not how Google normally does things.
