Comment by Twey
10 hours ago
This is the best explanation of (my take on) this I've seen so far.
On top of the article's excellent breakdown of what is happening, I think it's important to note a couple of driving factors about why (I posit) it is happening:
First, and this is touched upon in the OP but I think could be made more explicit: a lot of people who bemoan the existence of software development as a discipline see it as a morass of incidental complexity. This is significantly an instance of Chesterton's Fence. Yes, there certainly is incidental complexity in software development, or at least complexity that is incidental at the level of abstraction most corporate software lives at. But as a discipline we're pretty good at eliminating it when we find it. It sometimes takes a while, but the speed at which we iterate means we eliminate it a lot faster than most other disciplines do. A lot of the complexity that remains is actually irreducible, or at least we don't yet know how to reduce it.

A case in point: programming language syntax. The syntax of modern programming languages (where the commas go, whether whitespace means anything, how angle brackets are parsed) looks to the uninitiated like a jumble of arcane nonsense that must be memorized before one can start really solving problems, and it is indeed a real barrier to entry that non-developers, budding developers, and sometimes seasoned developers have to contend with. But it's also (a selection of competing frontiers of) the best language we have, after many generations of rationalistic and empirical refinement, for humans to unambiguously specify what they mean at the semantic level of software development as it stands! For a long time now, programming language syntax hasn't been constrained by the complexity or performance of parser implementations. Instead, modern programming languages tend toward simpler formal grammars because simpler grammars make it easier for _humans_ to understand what's going on when reading the code. AI tools promise to (amongst other things; don't come at me AI enthusiasts!) replace programming language syntax with natural language.
But actually natural language is a terrible syntax for clearly and unambiguously conveying intent! If you want a more venerable example, just look at mathematical syntax, a language that has never been constrained by computer implementation but was developed by humans for humans to read and write their meaning in subtle domains efficiently and effectively. Mathematicians started with natural language and, through a long process of iteration, came to modern-day mathematical syntax. There's no push to replace mathematical syntax with natural language because, even though that would definitely make some parts of the mathematical process easier, we've discovered through hard experience that it makes the process as a whole much harder.
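To make that ambiguity concrete, here's a toy sketch (the request, file names, and sizes are all invented for illustration): one English instruction admits two defensible parses, a choice that any formal grammar would force us to make explicitly.

```python
# The request "delete the logs and temp files over 100 MB" is
# ambiguous: does "over 100 MB" modify both kinds of file, or
# only the temp files? (All names and sizes are hypothetical.)
files = [
    ("app.log", "log",  10),   # (name, kind, size in MB)
    ("big.log", "log",  500),
    ("scratch", "temp", 200),
    ("cache",   "temp", 5),
]

# Reading 1: the size filter applies to logs AND temp files.
reading1 = [name for name, kind, size in files if size > 100]

# Reading 2: the size filter applies only to temp files.
reading2 = [name for name, kind, size in files
            if kind == "log" or size > 100]

print(reading1)  # ['big.log', 'scratch']
print(reading2)  # ['app.log', 'big.log', 'scratch']
```

A programming language would reject the request until the grouping was written down; natural language happily lets both readings through.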
Second, humans (as a gestalt, not necessarily as individuals) always operate at the maximum feasible level of complexity, because there are benefits to be extracted from the higher complexity levels, and if we are operating below our maximum complexity budget we're leaving those benefits on the table. From time to time we really do manage to hop up the ladder of abstraction, at least as far as mainstream development goes. But the complexity budget we save by no longer needing to worry about the details we've abstracted over immediately gets reallocated to the upper abstraction levels, buying things like development velocity, correctness guarantees, or UX sophistication. This implies that the sum total of complexity involved in software development will always remain roughly constant.

This is of course a win, since we can produce more and better software (assuming we really have abstracted over those low-level details and they're not waiting for the right moment to leak through our nice clean abstraction layer and bite us…), but as a process it will never reduce the total amount of ‘software development’ work to be done, whatever kinds of complexity that work comes to comprise. In fact, anecdotally it seems to be subject to something like Jevons' paradox (induced demand): the more software we build, and the more our society runs on software, the higher the demand for software becomes.

If you think about it, this is actually quite a natural consequence of the ‘constant complexity budget’ idea. As we know, software is made of decisions (https://siderea.dreamwidth.org/1219758.html), and the more ‘manual’ labour we free up at the bottom of the stack, the more complexity budget we free up to spend on the high-level decisions at the top. But there's no cap on decision-making!
If you ever find yourself with spare complexity budget left over after making all your decisions you can always use it to make decisions about how you make decisions, ad infinitum, and yesterday's high-level decisions become today's menial labour. The only way out of that cycle is to develop intelligences (software, hardware, wetware…) that can not only reason better at a particular level of abstraction than humans but also climb the ladder faster than humanity as a whole — singularity, to use a slightly out-of-vogue term. If we as a species fall off the bottom of the complexity window then there will no longer be a productivity-driven incentive to ideate, though I rather look forward to a luxury-goods market of all-organic artisanal ideas :)
I don't even think that "singularity-level coding agents" get us there. A big part of engineering is working with PMs, working with management, working across teams, working with users, to help distill their disparate wants and needs down into a coherent and usable system.
Knowing when to push back, when to trim down a requirement, when to replace a requirement with something slightly different, when to expand a requirement because you're aware of multiple distinct use cases to which it could apply, or even a new requirement that's interesting enough that it might warrant updating your "vision" for the product itself: that's the real engineering work that even a "singularity-level coding agent" alone could not replace.
An AI agent almost universally says "yes" to everything. It has to! If OpenAI starts selling tools that refuse to do what you tell them, who would ever buy them? And maybe that's the fundamental distinction: something that says "yes" to everything isn't a partner, it's a tool, and a tool can't replace a partner by itself.
I think that's exactly an example of climbing the abstraction ladder. An agent that's incapable of reframing the current context, given a bad task, will try its best to complete it. An agent capable of generalizing to an overarching goal can figure out when the current objective is at odds with the more important goal.
You're correct in that these aren't really ‘coding agents’ any more, though. Any more than software developers are!
Not just the abstraction ladder though. Also the situational awareness ladder, the functionality ladder, and most importantly the trust ladder.
I can kind of trust the thing to make code changes because the task is fairly well-defined, and there are compile errors, unit tests, code reviews, and other gating factors to catch mistakes. As you move up the abstraction ladder though, how do I know that this thing is actually making sound decisions versus spitting out well-formatted AIorrhea?
At the very least, they need additional functionality: to sit in on and contribute to meetings, write up docs and comment threads, ping relevant people on chat when something changes, and set up meetings to resolve conflicts or uncertainties. More generally, they need to understand their own role; the people they work with, along with those people's roles, levels, and idiosyncrasies; the relative importance and idiosyncrasies of different partners; the exceptions to supposed invariants, why they exist, what they imply, and when they shouldn't be relied on; and when to escalate vs. decide vs. defer vs. chew on something for a few days while doing other work.
For example, say you have an authz system and three partners requesting three different features, the combination of which would create an easily identifiable, easily attackable authz back door. Unless you specifically ask the AI to look for this, it'll happily implement all three features and sink your company. You can't fault it: it did everything you asked. You just trusted it with an implicit requirement that it didn't meet; it wasn't "situationally aware" enough to read between the lines. What you really want is something that would preemptively identify the conflict, schedule meetings with the different parties, get a better understanding of what each request is trying to unblock, and ideally distill everything down into a single feature that unblocks them all. You can't move up the abstraction ladder without moving up all those other ladders as well.
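As a sketch of that failure mode (every rule, role, and name here is hypothetical), consider three feature requests that would each pass review in isolation: self-service signup, role delegation, and support-staff impersonation.

```python
# Toy authz policy: three individually reasonable features that
# combine into a back door. Purely illustrative, not a real system.
roles = {"alice": {"admin"}, "sam": {"support"}}

def signup(user):
    # Feature 1: self-service signup, no human review.
    roles.setdefault(user, {"member"})

def delegate(granter, grantee, role):
    # Feature 2: any user may grant a role they themselves hold.
    if role in roles.get(granter, set()):
        roles.setdefault(grantee, set()).add(role)

def can_impersonate(actor):
    # Feature 3: support staff may impersonate any user (debugging).
    return "support" in roles.get(actor, set())

# Each feature is defensible on its own. Chained together:
signup("mallory")                      # attacker joins via signup
delegate("sam", "mallory", "support")  # one careless delegation
print(can_impersonate("mallory"))      # True: account takeover
```

An agent implementing each ticket on its own would happily ship all three; spotting the chain is exactly the cross-cutting situational awareness described above.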
Maybe that's possible someday, but right now they're still just okay coders with no understanding of anything beyond the task you just gave them to do. That's fine for single-person hobby projects, but it'll be a while before we see them replacing engineers in the business world.
I’ve been creating agents to better navigate the early stages of the SDLC. Here are my findings from the last 12 months:
I Built A Team of AI Agents To Perform Business Analysis
https://bettersoftware.uk/2026/01/17/i-built-a-team-of-ai-ag...
> don't come at me AI enthusiasts!
no need to worry; none of them know how to read well enough to make it this far into your comment
Actually they're the only ones who do: copy and paste into chatgpt with "distill this please".
Twitter has a lot to answer for!