
Comment by chasd00

1 month ago

Some stats are trickling out in my company. Code-heavy consulting projects show about 18% efficiency gains, but I have a problem with that number because no one has been able to tell me how it was calculated. Actual vs. estimated story points is probably how it was done, but that's nonsensical because we all know how subjective estimates, and even actuals, are. It's probably impossible to get a real number that doesn't carry a significant component of "well, I feel about x% more efficient…"
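For what it's worth, here's a sketch (all numbers invented, since nobody will say how the real figure was computed) of the kind of story-point arithmetic that could produce an "18%" headline:

```python
# Hypothetical sketch (made-up sprint data) of how an "18% efficiency
# gain" might fall out of story points: compare estimated points
# delivered per actual point spent, before and after AI adoption.

def efficiency(estimated, actual):
    # Estimated points delivered per actual point of effort.
    return sum(estimated) / sum(actual)

# Invented sprints: same estimates, different actuals.
before = efficiency(estimated=[8, 13, 5, 8], actual=[10, 15, 6, 9])
after = efficiency(estimated=[8, 13, 5, 8], actual=[7, 11, 6, 10])

gain = (after - before) / before
print(f"apparent efficiency gain: {gain:.0%}")  # -> 18%
```

Nudge any of those actuals by a point or two and the headline number swings wildly, which is exactly the subjectivity problem.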

More interesting, imo, would be a measure of maintainability. I've heard that code that's largely written by AI is rarely remembered by the engineer who submitted it, even a week after merging.

You're then almost "locked in" to using more AI on top of it. It may also make it harder to give non-technical staff estimates of how long it'd take to make a change or implement a new feature.

  • I don’t know how to measure maintainability, but the AI-generated code I’ve seen in my projects is pretty plain-vanilla standard patterns with comments. So, less of a headache than a LOT of human code I’ve seen. Also, one thing the agents are good at, at least in my experience so far, is documenting existing code. This goes a long way in maintenance. It’s not always perfect, but as the saying goes, documentation is like sex: when it’s good, it’s great; when it’s bad, it’s still better than nothing.

    • Something I occasionally do is ask it to extensively comment a section of code for me, and to tell me what it thinks the intent of the code was, which takes a lot of cognitive load off of me. It means I'm in the loop without shutting off my brain, as I do have to read the code and understand it, so I find it a sweet spot of LLM use.

    • By "maintainability" and "rarely remembered by the engineer" I'm assuming the bigger concern (beyond commenting and sane code) is that once everyone starts producing tons of code without looking - and reading (reviewing) code is, to me at least, much harder than writing it - then all of this goes unchecked:

      * subtle footguns

      * hallucinations

      * things that were poorly or incompletely expressed in the prompt and ended up implemented incorrectly

      * poor performance or security bugs

      other things (probably correctable by fine-tuning the prompt and the context):

      * lots of redundancy

      * comments that are insulting to the intelligence (e.g., "here we instantiate a class")

      * ...

      Not to mention reduced human understanding of the system: where it might break, and how this implementation is likely to behave. All of this will come back to bite during maintenance.
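      To make the low-value-comment point above concrete, an illustrative (entirely invented) pair:

```python
# A comment that merely restates the code is the "insulting" kind:
retries = 3  # set retries to 3

# A comment that records intent is the kind that helps maintenance:
retries = 3  # the vendor API throttles bursts; 3 retries with backoff
             # was enough to clear transient 429s in our (hypothetical) testing
```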

    • I find it funny that we, collectively, are now okay with comments in the code.

      I remember the general consensus on this _not even two years ago_ being that the code should speak for itself and that comments harm more than help.

      This matters less when agentic tools are doing the maintenance, I suppose, but the backslide in this practice is interesting.


  • chasd00 did mention that this was for consulting projects, where presumably there's a handover to another team after a period of time. Maintainability was never a high priority for consultants.

    But in general I agree with your point.

  • > engineer that submitted it

    This is a poor metric as soon as you've hired a second engineer, let alone reached the scale where 10% annual turnover means more than one employee leaving, much less the scale where a layoff is possible.

    It's also only a hope as soon as you have dependencies you don't directly manage, like community libraries.

Hint: Make sure the people giving you the efficiency-improvement numbers don't have a vested interest in the numbers looking good. If they do, you cannot trust the numbers.

Reminds me of my last job where the team that pushed React Native into the codebase were the ones providing the metrics for "how well" React Native was going. Ain't no chance they'd ever provide bad numbers.