Comment by MattPalmer1086

4 years ago

Yep, the curse of doing things right. You can't prove it was needed.

You need to demonstrate that you are fixing genuine problems, or you will eventually be replaced by someone who delivers faster, even if there are subsequent bugs.

One way to do this is to negotiate with the business on what needs doing, using risk. If you think there is a risk of a security or stability issue, then you should be able to assess that risk. The business can then choose to accept the risk and add some features, or fix the risk. It is essential that the owner of the system officially accepts the risks presented. You cannot own the risks.

This lets the business prioritise the work according to its risk appetite. And if the shit hits the fan, you are not only covered but your reputation will increase.

While this works with rational actors, the experience I have had in the industry is often the opposite. In fact, the company I work for now is probably the only company I've worked for in the last decade that actually evaluates risk correctly. The average corporate drone overseeing the engineering org is typically the least rational actor in the entire org.

Given the opportunity, most start-ups and mid-tier businesses will prioritize speed over safety. Despite my many attempts to explain this trade-off in engineer-speak, business-speak, or some combination of the two, the need for money and the need to constantly impress investors trumps all. I have quite literally told people the total cost of a half-fix will be more than double the cost in engineering hours of implementing a correct fix, and by and large the half-fix will be chosen because it "gets the feature out to users quicker". It's the most asinine thing I've heard, and I say that fully understanding the need to deliver on time and on budget.

In the end your ass is never covered. It will be your fault whether you suggested doing it and they said no, or they said yes. Your team will end up working the long hours to implement the obvious security and safety changes. The math for the other side is simple: if the cost of taking on the risk is less than the cost of implementing the fix, it will never get done. Companies use pager duty for free labor for a reason. It's the industry's most effective enabler of poor practices.
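To make that math concrete, here is a rough sketch of the comparison as I understand it. The "expected loss = likelihood times impact" model, the function names, and every number are my own assumptions for illustration, not anything a real business has written down.

```python
def expected_annual_loss(probability_per_year: float, impact_cost: float) -> float:
    """Expected yearly cost of carrying a risk: likelihood times impact (assumed model)."""
    return probability_per_year * impact_cost


def business_will_fix(probability_per_year: float, impact_cost: float, fix_cost: float) -> bool:
    """The rule described above: the fix only happens if carrying the risk costs more."""
    return expected_annual_loss(probability_per_year, impact_cost) > fix_cost


# Hypothetical numbers, purely for illustration.
unlikely_incident = business_will_fix(0.05, 200_000, 40_000)
likely_incident = business_will_fix(0.50, 200_000, 40_000)
print(unlikely_incident, likely_incident)
```

Note what the comparison leaves out: the long hours your team works when the risk lands never show up on the business's side of the ledger.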

Sure, something as simple as "we should really hash our passwords" might be so glaringly obvious that even the most dense business person would understand. But it's when you wander into the land of ambiguity that you really get burned. When the company is spending $XX,XXX/mo. on cloud storage because the ticket specifically said not to worry about lifecycle, it's going to be you in the office explaining why this wasn't fixed. Rarely will any business person take "it's your fault" as the answer. They'll happily assign you as many 60-hour weeks as you need to fix the problem, and in a large enough corporate-tier screw-up you may be the sacrificial lamb so the investors can feel like "the problem was solved".

Call me cynical, but this is an unwinnable battle. Unfortunately, until software bugs start literally killing people, the desire to actually allow engineers to do their job will be low.

  • And that is as it should be (except the blaming for issues you raised).

    The business should be able to choose what the priorities are. The business does not exist to produce beautiful code.

    If a business wilfully disregards security or stability risks that it was informed of, and it gets bitten, then it will almost certainly end up paying more in resources and engineering time to fix them. But that's the trade-off it chose to make.

    If a business plays the blame game here, it's simply time to find another job. They are not going to be a good place to work at all.

    • As an addendum, I've been on both sides of this argument. When I became the chief technical architect for a software company, I had to tell my colleagues to build a huge pile of crap to hit important commercial milestones and keep customers happy.

      I pointed out that the total cost of doing this badly, then unwinding it and doing it properly would be at least double the total cost of just doing it right.

      That didn't matter. We had to hit those goals. So that's what we did. It was expensive, buggy and kept us in business. My colleagues finally understood I had turned to the dark side :)