Comment by def8cefe

8 years ago

> No one was able to stop the machine because no one really knew all the different processes or they weren't built to stop midway.

This is (in my opinion) one of the reasons why everyone talking about getting rid of dedicated IT ops in their organizations is making a mistake. You can have devs building integrations and automation all day, but you still need sysadmins who can see the whole picture and override them when necessary. Having an outsourced (or even internal) hell desk that goes off a script doesn't take care of situations like this either.

Don't worry, you can still pay outlandish prices for top-tier A-Z support, where you can argue for a week with a drone that no, their system is not properly following the HTTP RFC, and here are the TCP dumps to prove it.

Imagine if we didn't have an operations engineer who knew how to track down the problem using a TCP dump. Or if it had been a developer who was, by proper "DevOps" hygiene, denied access to the boxes to run TCP dumps.

Yeah, they're fixing it. No, we don't have a timeline.

  • The worst bit is that someone internal to that business has probably also noticed that the system isn't properly following the HTTP RFC and is fighting the machine themselves to get it fixed with similar results.

I had a situation once where the Symantec firewall stopped working on my company laptop. Not the whole SEP suite, not the antivirus, just the firewall part. After 2 months I got my first angry email from a robot, then I started getting them every month with big red text trying to scare me with every possible corporate hell, and it gradually started CCing higher and higher management in the company. Meanwhile nobody could fix it, and it didn't help that IT support was outsourced to another country (I called them several times with zero results). At some point said managers started visiting me personally (I assume because their inaction was being escalated even further). In the end nobody fixed this specifically, but I managed to do it by accident while fixing a VirtualBox install - after increasing some obscure "Network Filter" limit in Windows, both SEP and VirtualBox started working again.

I think I wasn't auto-fired like OP only because such a trigger hadn't been implemented in the company yet. Having no IT support available can be really bad.

Every so often, I see a 'brilliant' claim like "99% of what IT does is routine, so it should be outsourced or even automated!"

The percentage is debatable, but the more important issue is what that 1% looks like. It's definitely not 1% of their value: it's the high-stakes stuff that needs a context-aware human to apply a fix or make a good decision in place of whatever The Machine is trying to do.

(And if you're going to keep experts on full time to handle the occasional 1% case, there's no longer much reason to outsource everything else.)