
Comment by awakeasleep

10 hours ago

Explain how fragility of implementation, like spaghetti code and high coupling with low cohesion, fits into your worldview?

As human developers, I think we're struggling with "letting go" of the code. The code we write (or agents write) is really just an intermediate representation (IR) of the solution.

For instance, GCC will inline functions, unroll loops, and apply myriad other optimizations that we never review (and actually want!). But when we read the ASM that GCC generates, we are not concerned with the "spaghetti," the "high coupling," or the "low cohesion." We care that it works and is correct for what it is supposed to do.
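The same dynamic, an "IR" nobody reviews for style, is visible even inside CPython, whose bytecode compiler quietly constant-folds expressions. A minimal sketch of that idea (the behavior shown is CPython-specific):

```python
import dis

def seconds_per_day():
    # The source spells out the arithmetic for readability;
    # the compiler folds it to a single constant behind our backs.
    return 60 * 60 * 24

# On CPython, the folded constant 86400 sits directly in the compiled
# code object. Nobody code-reviews that "generated code" for cohesion.
print(86400 in seconds_per_day.__code__.co_consts)  # prints True
dis.dis(seconds_per_day)  # bytecode shows a single LOAD_CONST, no math
```

We happily accept this transformation because we trust the translation, which is exactly the property in dispute for LLM-generated code.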

Source code in a higher-level language is not really different anymore. Agents write the code, maybe we guide them on patterns and correct them when they are obviously wrong, but the code is just the work-item artifact that comes out of extensive specification, discussion, proposal review, and more review of the reviews.

A well-guided, iterative process and problem/solution description should be able to generate an equivalent implementation whether a human is writing the code or an agent.

  • A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent. It can do this because it is translating from one formal language to another.

    Translating a natural-language prompt, on the other hand, requires the LLM to make thousands of small decisions that will come out differently each time you regenerate the artifact. Even ignoring non-determinism, prompt instability means that any small change to the spec will result in a vastly different program.

    A natural language spec and test suite cannot be complete enough to encode all of these differences without being at least as complex as the code.

    Therefore each time you regenerate large sections of code without review, you will see scores of observable behavior differences that will surface to the user as churn, jank, and broken workflows.

    Your tests will not encode every user workflow, not even close. Ask yourself whether you have ever worked on a non-trivial piece of software where you could randomly regenerate 10% of the implementation, while keeping to the spec, without seeing a flurry of bug reports.

    This may change if LLMs improve to the point that they can reason about code changes the way a human can. As of today they cannot, and they require tests and human code review to keep them from spinning out. But I suspect at that point they'll be doing our jobs, as well as the CEO's, and we'll have bigger problems.
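The churn claim can be made concrete with a toy example (the spec and names here are hypothetical, not from the thread): two equally valid "regenerations" of the same one-line spec pass the same test yet differ in user-observable output.

```python
# Spec: "return users sorted by age". Two plausible regenerations:

def sort_users_v1(users):
    # Stable sort: preserves input order among equal ages.
    return sorted(users, key=lambda u: u[1])

def sort_users_v2(users):
    # Same spec, different tie-breaking: falls back to name.
    return sorted(users, key=lambda u: (u[1], u[0]))

users = [("zoe", 30), ("amy", 30), ("bob", 25)]

# The obvious spec test only checks that ages end up non-decreasing...
for impl in (sort_users_v1, sort_users_v2):
    ages = [age for _, age in impl(users)]
    assert ages == sorted(ages)

# ...yet the observable output differs between the two regenerations.
print(sort_users_v1(users))  # [('bob', 25), ('zoe', 30), ('amy', 30)]
print(sort_users_v2(users))  # [('bob', 25), ('amy', 30), ('zoe', 30)]
```

Any user who depended on the old tie order experiences this as a regression, even though both versions "meet the spec."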

    • I don't see a world where a motivated soul can build a business from a laptop and a token service as a problem. I see it as an opportunity.

      I feel similarly about Hollywood and the creation of media. We're not there yet in either case, but we will be; that's pretty clear. And when I look at the feudal society that is the entertainment industry here, I don't understand why so many of the serfs are trying to perpetuate it in its current state. I really don't get why engineers think this technology is going to turn them into serfs, unless they let that happen to themselves. If you can build things, AI coding agents will let you build faster and build more for the same amount of effort.

      I am assuming, given the rate of advance of AI coding systems in the past year, that there is plenty of improvement to come before this plateaus. I'm sure that will include AI systems that do security reviews at a human level or better. I've already seen Claude find 20-plus-year-old bugs in my own code. They weren't particularly mission-critical, but they were there the whole time. I've also seen it do amazingly sophisticated reverse engineering of assembly code, only to fall flat on its face on the simplest tasks.

      25 replies →

    • As if humans are deterministic when you delegate tasks to them. I would hope that your test cases cover the requirements. If not, your implementation is just as brittle when other developers come on board, or even when you come back to a project after six months.

      1 reply →

  • Valid points. But a crucial part of not "letting go" of the code is that we are responsible for that code at the moment.

    If, in the future, LLM providers take ownership of the on-call for the code they have produced, I would write an "AUTO-REVIEW-ACCEPTER" bot to accept everything and deploy it to production.

    If a company requires me to own something, then I should know what that thing is, understand its ins and outs in detail, and be able to adjust quickly when things go wrong.

  • You are comparing compilers to a completely non-deterministic code-generation tool that often does not take observable behavior into account at all, and will happily break a part of your system without you noticing because you misworded a single prompt.

    No amount of unit/integration tests covers every single use case in sufficiently complex software, so you cannot rely on that alone.

  • I've actually found that well-written well-documented non-spaghetti code is even more important now that we have LLMs.

    Why? Because LLMs get confused easily, so they need well-written code they can understand if they are going to maintain the codebase they write.

    The cleaner I keep my codebase, and the better (not necessarily more) abstracted it is, the easier it is for the LLM to understand the code within its limited context window. Good abstractions help the right level of understanding fit within the context window, etc.

    I would argue that the use of LLMs changes what good code is, since "good" now means you have to meaningfully fit good ideas into chunks of 125k tokens.

  • When requirements change, a compiler has the benefit of not having to go back and edit the binary it produced.

    Maybe we should treat LLM-generated code similarly: just generate everything fresh from the spec any time there's a change. Personally, though, I haven't had much success with that yet.

  • This is fantasy completely disconnected from reality.

    Have you ever tried writing tests for spaghetti code? It's hell compared to testing good code. LLMs require a very strong test harness or they're going to break things.

    Have you tried reading and understanding spaghetti code? How do you verify it does what you want, and none of what you don't want?

    Many code design techniques were created to make things easy for humans to understand. That understanding needs to be there whether you're modifying it yourself or reviewing the code.

    Developers are struggling because they know what happens when you have 100k lines of slop.

    If things keep accelerating in this direction, we're going to wake up to a world of pain in three years, and AI isn't going to get us out of it.

    • I've found, even pre-AI, much more utility in a good suite of integration tests than in unit tests. For instance, if you are building a test harness for an API, you don't even need access to the code if you are writing tests against the API surface itself.

      1 reply →
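That black-box approach can be sketched end to end with only the standard library. The `/health` endpoint and its payload here are invented stand-ins, not anything from the thread; the point is that the test speaks only HTTP, so it survives any rewrite or regeneration of the implementation behind it.

```python
# Black-box test against an HTTP API surface: no access to the
# implementation is needed, only the wire contract.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    # Stand-in implementation; the test below never imports or reads it.
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep test output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The actual test: speak HTTP, assert on the contract, nothing else.
url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as resp:
    payload = json.loads(resp.read())

assert payload == {"status": "ok"}
server.shutdown()
print("API surface test passed")
```

Swap the handler for spaghetti, a rewrite, or regenerated code and the test is unchanged, which is exactly the property being argued for.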

You did see the part about my unit, integration and scalability testing? The testing harness is what prevents the fragility.

It doesn’t matter to AI whether the code is spaghetti code or not. What you said was only important when humans were maintaining the code.

No human should ever be forced to look at the code behind my vibe-coded internal admin portal: straight Python, no frameworks, server-side rendered HTML and JS for the front end, all hosted in a single Lambda along with much of the backend API.

I haven't done web development since 2002, when I used Classic ASP, aside from some copy-and-paste feature work once in a blue moon.

In my post-AI repos, my Claude/agent files have summaries of the initial statement of work, the transcripts from the requirements sessions, my well-labeled design diagrams, the transcripts of the design-review sessions where I explained the design to the client and answered questions, and a link to the Google NotebookLM project with all of the artifacts. I have separate md files for the different implementation components.

The NotebookLM project can be used for any future maintainers to ask questions about the project based on all of the artifacts.

  • > It doesn’t matter to AI whether the code is spaghetti code or not. What you said was only important when humans were maintaining the code.

    In my experience using AI to work on existing systems, the AI definitely performs much better on code that humans would consider readable.

    You can't really sit here talking about architecting greenfield systems with AI, using methodology that didn't exist six months ago, while confidently proclaiming "trust me, they'll be maintainable."

    Well you can, and most consultants do tend to do that, but it’s not worth much.

    • > Well you can, and most consultants do tend to do that

      Yeah they do.

      I'm familiar enough with the claims to feel confident there is plenty of nefarious astroturfing occurring all over the web including on HN.

    • I wasn't born into consulting in 1996. AI for coding is by definition the worst today that it will ever be. What makes you think that the complexity of the code will increase faster than the capability of the agents?

      3 replies →

In my experience, consulting companies typically have a bunch of low-to-medium skilled developers producing crap, so the situation with AI isn't much different. Some are better than others, of course.

Also: developer UX, common antipatterns, etc.

This "the only thing that matters about code is whether it meets requirements" line is such a tired take, and I can't imagine anyone seriously spouting it who has had to maintain real software.

  • The developer UX is the markdown files, if no developer ever looks at the code.

    Whether you are tired of it or not, absolutely no one in your value chain, neither the customers who give your company money nor your management chain, cares about your code beyond whether it meets the functional and non-functional requirements. They never did.

    And of course whether it was done on time and on budget

    • As a consumer of goods, I care quite a bit about many of the “hows” of those goods just as much as the “whats”.

      My home, which I own, for example, is very much a "what" that keeps me warm and dry. But how it was constructed is the difference between (1) me cursing the amateur and careless decision-making of the builders and (2) me quietly sipping a cocktail on the beach, free of a care in the world.

      “How” doesn’t matter until it matters, like when you put too much weight onto that piece of particle board IKEA furniture.

      1 reply →

  • I personally haven't made up my mind either way yet, but I imagine a vibe-coding advocate would say that maintaining code makes sense only when the code is expensive to produce.

    If the code is cheap to produce, you don't maintain it, you just throw it away and regenerate.

    • If you have users, this only works if you have managed to encode nearly every user observable behavior into your test suite.

      I’ve never seen this done even with LLMs. Not even close. And even if you did it, the test suite is almost definitely more complex than the code and will suffer from all the same maintainability problems.