Comment by lathiat
4 years ago
I often see "this person should be fired" etc. in response to legitimate mistakes, but that's massively undervaluing the mistake.
There's a pretty good chance the person who made such a mistake is less likely to repeat it or other mistakes than someone who hasn't yet had the experience. So at best you're just sending a more valuable employee to one of your competitors.
There's some gray area when it comes to making a poor judgement call to skip a well-defined procedure, or being lazy. And there are some cases this doesn't apply to, like intentional malice or a continued series of similar, obviously avoidable mistakes.
On a related note, I'm often frustrated when people play a blame game (even if one might be somewhat called for, in a way) instead of considering "human factors" and how to systematically prevent people from being able to make such mistakes. Or they refuse to accept a human-factors-style explanation and just say the person "should have known better" etc. The best high-profile example was the Hawaii missile alert that was supposed to be a test, where the system had an incredibly poor UI. But there are countless examples everywhere, all the time. I try to think about this a lot in my work.
How's the anecdote go... "Why would I fire him? I just spent $200,000 training him not to do this thing, he'll never make this mistake again!"
According to one source:
“Recently, I was asked if I was going to fire an employee who made a mistake that cost the company $600,000. No, I replied, I just spent $600,000 training him. Why would I want somebody to hire his experience?”
– Thomas John Watson Sr., IBM
Wasn’t it, “I just spent 1,000,000 on training, there’s no way I could afford to fire you”?
Ah, I think that's closer to what I was remembering!
If someone can make these kinds of mistakes, IMHO it is usually not their fault and instead is the fault of the systems that failed to prevent it from happening.
It is still their fault -- they did something they shouldn't have done. But the mistake exposes a bigger problem in the underlying infra, which is a good thing.
I think it's harmful to ascribe fault as a binary thing.
Assuming, of course, that this wasn't some deliberate act (because that would be weird):
The person who ultimately pressed the button which caused the code to run that sent this email only shares some portion of the fault. Maybe that person even wrote and deployed the code.
There are many other deficient processes that led to this even being possible. Why did test code run in a place that had access to production credentials? What caused the code to run in the first place: was it accidentally triggered by some other bug, or deliberately run by somebody who didn't realize they were in production? If so, why are their systems built in a way that makes it hard to realize when you're in production? Why is the system architected in such a way that large quantities of email can be sent inadvertently without some sort of approval? You could always delay large batches and send an alert so a human on-call could be in the loop to detect and delay such emails.
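To make the "delay large batches and alert a human" idea concrete, here is a rough sketch of what such a gate might look like. Nothing here reflects how HBO's systems actually work: `EmailBatch`, `BatchGate`, the `is_test` flag, and the threshold of 1000 are all made-up names and numbers for illustration, and the real sending step is replaced by appending to a list.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical cutoff: batches at or above this size need a human to approve them.
LARGE_BATCH_THRESHOLD = 1000


@dataclass
class EmailBatch:
    subject: str
    recipients: List[str]
    is_test: bool = False  # assumed flag set by whatever code built the batch


@dataclass
class BatchGate:
    environment: str               # e.g. "production" or "staging"
    alert: Callable[[str], None]   # stand-in for paging the on-call human
    threshold: int = LARGE_BATCH_THRESHOLD
    held: List[EmailBatch] = field(default_factory=list)
    sent: List[EmailBatch] = field(default_factory=list)

    def submit(self, batch: EmailBatch) -> str:
        # Test traffic should never go out from production at all.
        if batch.is_test and self.environment == "production":
            self.alert(f"Rejected test batch in production: {batch.subject!r}")
            return "rejected"
        # Large batches are held until a human explicitly approves them.
        if len(batch.recipients) >= self.threshold:
            self.held.append(batch)
            self.alert(
                f"Holding {len(batch.recipients)} emails for approval: {batch.subject!r}"
            )
            return "held"
        # Small, non-test batches go straight through.
        self.sent.append(batch)
        return "sent"

    def approve(self, batch: EmailBatch) -> str:
        # Called by the on-call human after reviewing a held batch.
        self.held.remove(batch)
        self.sent.append(batch)
        return "sent"
```

Something like `BatchGate(environment="production", alert=print).submit(batch)` would then return "rejected", "held", or "sent". The point is only that the decision to send a large batch passes through a human, not that this particular shape is the right one.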
You can't say that without knowing the details.
I've definitely seen issues where the engineers at the keyboards that day weren't at all at fault, and were just doing exactly what was asked of them, but systemic issues caused something like this. You can blame poor tech hygiene by the whole team, and lack of foresight by the manager, but most of that would be 20/20 hindsight.
This is why blameless postmortems are a good thing, because humans are simply awful with hindsight bias.
Best thing to do is just figure out how not to do it in the future.
On the list of screw-ups I've seen, this is pretty benign. I'm sure not long after this happened, they either realized their mistake or found a flaw in the system they were working with. No reprimand needed, maybe some joking criticism to ease the stress I'm sure they're feeling knowing some asshat manager might actually think this is terrible and do something stupid like fire them.
The worst side effect of this for HBO is probably the cost of some confused customers calling customer service unnecessarily.
The other side effect is that a non-trivial number of people probably reported email from HBO Max as spam to Gmail etc. Speaking of which--I got this and, while I subscribed to HBO on Comcast for a while, I've never been an HBO Max customer.
"Judge on instances of success, but patterns of failure"
Some successes are obvious luck (e.g. winning the lottery). Some failures are the obvious result of bad choices or are so egregious they don't warrant generosity (e.g. cheating on a spouse).