
Comment by multisport

3 days ago

Yes agreed, and tbh even if that thesis is wrong, what does it matter?

In my experience, what happens is that the codebase starts to collapse under its own weight. It becomes impossible to fix one thing without breaking another. The coding agent fails to recognize the global scope of the problem and tries local fixes over and over. Progress gets slower, and new features cost more. All the same problems faced by an inexperienced developer on a greenfield project!

Has your experience been otherwise?

  • Right. I'm a daily user of agentic LLM tools and have this exact problem in one large project: complex business logic dictated by real-world requirements outside my control, and, let's say, legacy code of variable quality.

    I remember when Gemini Pro 3 was the latest hotness and I started to get FOMO seeing demos on X posted to HN showing it one-shotting all sorts of impressive stuff. So I tried it out for a couple of days in Gemini CLI/OpenCode and ran into the exact same pain points I was dealing with using CC/Codex.

    Flashy one-shot demos of greenfield prompts are a natural hype magnet, so they get lots of attention, but in my experience they aren't particularly useful for gauging value in complex legacy projects with tightly bounded requirements that can't be easily reduced to a page or two of prose for a prompt.

    • To be fair, you're not supposed to be doing the "one shot" thing with LLMs in a mature codebase.

      You have to supply it with the right context and a well-formed prompt, get a plan, then execute and do some cleanup.

      LLMs are only as good as the engineers using them; you need to master the tool before you can be productive with it.

    • I would be much more impressed by LLMs implementing new, long-requested features in existing software (whose maintainers are open to maintaining the LLM-generated code later).

  • Adding capacity to software engineering through LLMs is like adding lanes to a highway — all the new capacity will be utilized.

    By getting the LLM to keep changes minimal I’m able to keep quality high while increasing velocity to the point where productivity is limited by my review bandwidth.

    I do not fear competition from junior engineers or non-technical people wielding poorly-guided LLMs for sustained development. Nor for prototyping or one-offs, for that matter — I'm confident I know what to ask the LLM for and how to ask it.

  • This is relatively easily fixed by increasing test coverage to near 100% and lifting critical components into model-checker space; both approaches were prohibitively expensive before November. They'll be accepted best practices by the summer.
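
    A minimal sketch of the coverage half of that claim, using property-based testing as a cheap stand-in for a real model checker -- the function and its invariant here are hypothetical:

      # Hypothetical example: let the checker explore the input space
      # instead of hand-picking a few cases.
      from hypothesis import given, strategies as st

      def apply_discount(price_cents: int, percent: int) -> int:
          """Hypothetical business-logic function under test."""
          return price_cents - (price_cents * percent) // 100

      @given(st.integers(min_value=0, max_value=10**9),
             st.integers(min_value=0, max_value=100))
      def test_discount_never_negative(price_cents, percent):
          # Invariant: no input combination may push the price below
          # zero or above the original price.
          assert 0 <= apply_discount(price_cents, percent) <= price_cents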

  • No, that has certainly been my experience, but what is going to be the forcing function to go back to hiring, after a company decides it needs fewer engineers?

  • Why not have the LLM rewrite the entire codebase?

    • In ~25 years of dealing with large, existing codebases, I've seen time and time again that there's a ton of business value and domain knowledge locked up inside all of that "messy" code. Weird edge cases that weren't well covered in the design, defensive checks and data validations, bolted-on extensions and integrations, etc., etc.

      "Just rewrite it" is usually -- not always, but _usually_ -- a sure path to a long, painful migration that usually ends up not quite reproducing the old features/capabilities and adding new bugs and edge cases along the way.

The whole point of good engineering was never just hitting the hard specs, but also having extensible, readable, maintainable code.

But if today it’s so cheap to generate new code that meets updated specs, why care about the quality of the code itself?

Maybe the engineering work today is to review specs and tests and let LLMs do whatever behind the scenes to hit the specs. If the specs change, just start from scratch.

  • "Write the specs and let the outsourced labor hit them" is not a new tale.

    Let's assume the LLM agents can write tests for, and hit, specs better and cheaper than the outsourced offshore teams could.

    So let's assume now you can have a working product that hits your spec without understanding the code. How many bugs and security vulnerabilities have slipped through "well tested" code because of edge cases of certain input/state combinations? Ok, throw an LLM at the codebase to scan for vulnerabilities; ok, throw another one at it to ensure no nasty side effects of the changes that one made; ok, add some functionality and a new set of tests and let it churn through a bunch of gross code changes needed to bolt that functionality into the pile of spaghetti...

    How long do you want your critical business logic relying on not-understood code with "100% coverage" (of lines of code and spec'd features) but super-low coverage of the actual possible combinations of input+machine+system state (a toy sketch of that gap follows below)? How big can that codebase get before "rewrite the entire world to pass all the existing specs and tests" starts getting very very very slow?

    We've learned MANY hard lessons about security, extensibility, and maintainability of multi-million-LOC-or-larger long-lived business systems, and those lessons don't go away just because you're no longer reading the code that's making you the money. They might even get more urgent. Is there perhaps a reason Google and Amazon didn't just hire 10x the number of people at 1/10th the salary to replace the vast majority of their engineering teams years ago?
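
    A toy illustration of that coverage gap (all names hypothetical): two tests can cover every line of a function while the input+state combination that actually bites is never exercised.

      # Hypothetical: 100% line coverage, yet a real bug survives.
      def withdraw(balance: int, amount: int, overdraft_ok: bool) -> int:
          if overdraft_ok:
              return balance - amount          # covered by test A
          return max(balance - amount, 0)      # covered by test B

      assert withdraw(100, 50, True) == 50     # test A
      assert withdraw(100, 50, False) == 50    # test B
      # Never tested: withdraw(100, 200, False) returns 0, silently
      # forgiving 100 of debt instead of rejecting the withdrawal.
      # Every line was "covered"; this combination never was.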

  •   > let LLMs do whatever behind the scenes to hit the specs

    Assuming, for the sake of argument, that that's completely true: what happens to "competitive advantage" in this scenario?

    It gets me thinking: if anyone can vibe-code from a spec, what's stopping Company A (or even User A) from telling an LLM agent "duplicate every aspect of this service in Python and deploy it to my AWS account xyz"...

    In that scenario, why even have companies?

    • It's all fun and games vibecoding until A) you have customers who depend on your product and B) it breaks, or the one person who does the prompting and has access to the servers and API keys gets incapacitated (or just bored).

      Sure, we can vibecode one-off projects that do something useful (my fav is browser extensions), but as soon as we ask others to use our code on a regular basis, the technical debt clock starts running. And we all know how fast dependencies in a project break.

    • You can do this for many things now.

      Walmart, McDonald's, Nike - none really have any secrets about what they do. There is nothing stopping someone from copying them - except that businesses are big, unwieldy things.

      When software becomes cheap, companies compete on support. We see this with open source software now.
