
Comment by bittermandel

4 days ago

> With DevOps, if you can automate it, you should automate it.

I couldn't agree less with this. At this point the whole "DevOps" industry is fueled by consultancies who make a great living from convincing business leaders that this is true. Focusing on defining clear processes for recurring events and building the fundamental building blocks that allow you to automate when it's absolutely needed should be the method, not spending more time writing Terraform.

> Focusing on defining clear processes for recurring events and building the fundamental building blocks that allow you to automate when it's absolutely needed should be the method, not spending more time writing Terraform.

For one, I don't really consider Terraform "automation" so much as IaC, but I digress. This is all well and good in mature organizations with robust processes and aligned leadership. In practice, however, what I find most often is a very small "devops" shop in a company that isn't necessarily "tech": 50-300 people, with a devops team of 3-5 (if they're lucky) that the organization, or sometimes even the team itself, sees as glorified IT sysadmins. They're always seen as an expense and usually critically understaffed. If you leave teams like this to decide on their own what is "necessary" to automate, you're going to get weird/misaligned/dysfunctional results, and even more so if you let the business decide, which is what usually happens. The business doesn't really give a crap if some poor former sysadmin has to spend 12 hours a day clicking buttons in the AWS console, as long as they get what they need (I've actually seen a guy making 150k to do basically just this).

So what happens, like I said, is teams get into this hell-loop of manual task after manual task. Not only does that take large amounts of mental bandwidth to keep track of the tasks and keep all the surrounding documentation or playbooks up to date (if you're lucky enough to get even that), you also have to deal with the inevitable mistakes and errors that come with doing things strictly manually, which eats up a ton of unnecessary time and thus $$.

I agree, though, that most devops consultants are terrible and that the industry is driven by this. However, this is the specific niche I've carved out for myself: coming in after the big, terrible consultant that basically just pitches a brittle Jenkins CI setup and some basic Terraform and charges you $250k for their time. I actually really enjoy doing it too, and the challenges and issues are almost always unique to the org, even if the patterns are similar - so it's always interesting.

So, long story short, unless you have a super robust process and mature system, it's usually just a lot easier to default to "automate" and come up with reasonable exceptions when it doesn't make sense to do so, rather than the other way around.

GitOps (declarative infra and config) and containers make automation really, really easy and completely eliminate all sorts of classes of problems. We push code to prod dozens of times a day without any issues, for months and months at a time, typically backed by only a single devops engineer to keep everything humming along or build new automation. The automation is the clear process, regularly spitting out messages to email or group chat somewhere, and it provides the audit trail.

I worked at a traditional finance company and we had a team of 8 people in traditional operations and another 30 people doing manual testing around the clock to support about 20 developers, 10 network staff, plus another 20-30 managers or leads and security. We could only deploy once a week, and there were always issues with "final check out" on Sunday morning when hotfixes had to go in or config was modified.

  • Sorry to laugh, but I’ve worked a lot in fintech and this is so painfully on point. I’ve also experienced the nirvana of a mature gitops system - getting there is really painful though (IMO).

    The funny thing is, the fintech company in the example you gave likely sees nothing wrong with this. I’ve seen cases where the release cycle is once a month or longer, with similar team sizes, and they don’t think they have an issue and would probably laugh at you or look at you weird if you mentioned CI/CD.

True. That's why IT services companies have such massive practices dedicated to DevOps. It's a great annuity business for them.

> not spending more time writing Terraform.

Also, there's another bit of nuance to that, as well as your overarching point about "automation isn't free," in that writing Terraform/Tofu isn't usually the long pole in that tent: debugging the raging PoS most certainly is (along with its associated https://xkcd.com/303/ of waiting for the "plan, attempt apply, puke, goto 1" loop)

And, in almost the exact same vein: writing any automation carries with it two downstream bits of work: monitoring the automation and having enough context to debug it when (WHEN) it falls over

  • People don’t know about https://dagger.io/, which solves this.

    • My experience has been that it's not a lack of knowledge; it's a combination of inertia, cargo culting, and give-a-shit

      There are so many great tools that solve so many problems but life is filled with trade-offs and many people don't value the same trade-offs that I do, so they just bash their head against Terraform (or $other_legacy_tool) because "it's what we use"

      I was really hoping that Earthly or Dagger were going to catch on due to the enormous number of folks that complain about not being able to run GitHub Actions (or GLCI) locally, on top of bitching about yaml alllllllllll the fucking time. But, same problem, IMHO: inertia is so strong
