Comment by whynotminot
2 days ago
> The hard thing about engineering is not "building a thing that works", it's building it the right way, in an easily understood way, in a way that's easily extensible.
You’re talking like in the year 2026 we’re still writing code for future humans to understand and improve.
I fear we are not doing that. Right now, Opus 4.5 is writing code that later Opus 5.0 will refactor and extend. And so on.
This sounds like magical thinking.
For one, there are objectively detrimental ways to organize code: tight coupling, lots of mutable shared state, etc. No matter who or what reads or writes the code, such code is more error-prone and more brittle to work with.
Then, abstractions are tools to lower the cognitive load. Good abstractions reduce the total amount of code written, let you reason about the code in terms of those abstractions, and do not leak within their area of applicability. Sequence, Future, or, well, the humble function are examples of good abstractions. No matter what kind of cognitive process handles the code, it benefits from having to keep a smaller amount of context per task.
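To make the coupling point concrete, here's a toy Python sketch (names invented for illustration, not from any real codebase): the first version mutates a shared global, so no function can be understood without knowing every other caller; the second keeps the state behind one small abstraction that can be reasoned about in isolation.

```python
from dataclasses import dataclass, field

# Tightly coupled: everything reads and mutates one shared global,
# so reasoning about any single function requires knowing all the others.
totals = {}

def record_sale(item, amount):
    totals[item] = totals.get(item, 0) + amount

def sales_report():
    return sorted(totals.items())

# Looser coupling: the state lives behind a small abstraction, and each
# method can be understood with only its own context.
@dataclass
class SalesLedger:
    totals: dict = field(default_factory=dict)

    def record(self, item: str, amount: float) -> None:
        self.totals[item] = self.totals.get(item, 0) + amount

    def report(self) -> list[tuple[str, float]]:
        return sorted(self.totals.items())

ledger = SalesLedger()
ledger.record("bread", 3.50)
print(ledger.report())  # [('bread', 3.5)]
```

Whoever (or whatever) touches the second version only needs the ledger's small interface in context, which is exactly the cognitive-load argument above.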
"Code structure does not matter, LLMs will handle it" sounds a bit like "Computer architectures don't matter, the Turing Machine is proved to be able to handle anything computable at all". No, these things matter if you care about resource consumption (aka cost) at the very least.
Yes, LLMs aren't very good at architecture. I suspect that's because the average project online has pretty bad architecture. The training set is poisoned.
It's kind of bittersweet for me because I was dreaming of becoming a software architect when I graduated university, but the role started disappearing, so I never actually became one!
But the upside of this is that now LLMs suck at software architecture... Maybe companies will bring back the software architect role?
The training set has been totally poisoned from the architecture PoV. I don't think LLMs (as they are) will be able to learn software architecture now because the more time passes, the more poorly architected slop gets added online and finds its way into the training set.
Good software architecture tends to be additive, as opposed to subtractive. You start with a clean slate then build up from there.
It's almost impossible to start with a complete mess of spaghetti code and end up with a clean architecture... Spaghetti-code abstractions tend to lead you astray... It's as if understanding spaghetti code soils your understanding of the problem domain. You start to think of everything in terms of terrible leaky abstractions and can't think about the problem clearly.
It's hard even for humans to look at a problem through fresh eyes; it's likely even harder for LLMs. For example, if you use a word in a prompt, the LLM tends to try to incorporate that word into the solution... So if the AI sees a bunch of leaky abstractions in the code, it will tend to work with them rather than remove them and find better abstractions. I see this all the time with hacks: if the code is full of hacks, an LLM tends to produce more hacks, and it's almost impossible to make it address root causes... Hacks beget more hacks.
Refactoring is a very mechanistic way of turning bad code into good. I don’t see a world in which our tools (LLMs or otherwise) don’t learn this.
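As a toy illustration of how mechanical that can be (invented Python example), an "extract function" refactor is nearly a rule you could apply blindly: find the duplicated block, name it, call it.

```python
# Before: the same validation is duplicated in two places.
def create_user(email: str) -> dict:
    if "@" not in email or email.startswith("@"):
        raise ValueError(f"invalid email: {email}")
    return {"email": email}

def invite_user(email: str) -> str:
    if "@" not in email or email.startswith("@"):
        raise ValueError(f"invalid email: {email}")
    return f"invitation sent to {email}"

# After the extract-function refactor: one named, reusable check.
def validated_email(email: str) -> str:
    if "@" not in email or email.startswith("@"):
        raise ValueError(f"invalid email: {email}")
    return email

def create_user_refactored(email: str) -> dict:
    return {"email": validated_email(email)}

def invite_user_refactored(email: str) -> str:
    return f"invitation sent to {validated_email(email)}"
```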
> For one, there are objectively detrimental ways to organize code: tight coupling, lots of mutable shared state, etc. No matter who or what reads or writes the code, such code is more error-prone and more brittle to work with.
Guess what: AIs don't like that either, because it makes it harder for them to achieve the goal. So with minimal guidance, which at this point could probably be provided by an AI as well, the output of an AI agent is not that.
Opus 4.5 is writing code that Opus 5.0 will refactor and extend. And Opus 5.5 will take that code and rewrite it in C from the ground up. And Opus 6.0 will take that code and make it assembly. And Opus 7.0 will design its own CPU. And Opus 8.0 will make a factory for its own CPUs. And Opus 9.0 will populate mars. And Opus 10.0 will be able to achieve AGI. And Opus 11.0 will find God. And Opus 12.0 will make us a time machine. And so on.
Objectively, we are talking about systems that have gone from being cute toys to outmatching most juniors using only rigid and slow batch training cycles.
As soon as models have persistent memory for their own try/fail/succeed attempts, and can directly modify what's currently called their training data in real time, they're going to develop very, very quickly.
We may even be underestimating how quickly this will happen.
We're also underestimating how much more powerful they become if you give them analysis and documentation tasks referencing high quality software design principles before giving them code to write.
This is very much 1.0 tech. It's already scary smart compared to the median industry skill level.
The 2.0 version is going to be something else entirely.
Can't wait to see what Opus 13.0 does with the multiverse.
https://users.ece.cmu.edu/~gamvrosi/thelastq.html
Wake me up at Opus 12
Just one more OPUS bro.
Honestly the scary part is that we don’t really even need one more Opus. If all we had for the rest of our lives was Opus 4.5, the software engineering world would still radically change.
But there’s no sign of them slowing down.
I also love how AI enthusiasts just ignore the issue of exhausted training data... You can't just magically create more training data. Also, synthetic training data reduces the quality of models.
You're mixing up several concepts. Synthetic data works for coding because coding is a verifiable domain. You train via reinforcement learning to reward code-generation behavior that passes detailed specs and meets other desiderata. It's literally how things are done today and how progress gets made.
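A minimal sketch of what "verifiable" buys you here (hypothetical helper, not any lab's actual training pipeline): run the generated code against a spec and turn the pass/fail outcome into a reinforcement-learning reward.

```python
import subprocess
import sys
import tempfile
import textwrap

def reward(candidate_code: str, spec_tests: str) -> float:
    """Return 1.0 if the candidate passes the spec's tests, else 0.0."""
    program = candidate_code + "\n\n" + spec_tests
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(program)
        path = f.name
    # Execute the candidate plus its tests; a failed assert exits nonzero.
    result = subprocess.run([sys.executable, path], capture_output=True, timeout=10)
    return 1.0 if result.returncode == 0 else 0.0

candidate = textwrap.dedent("""
    def add(a, b):
        return a + b
""")
spec = "assert add(2, 2) == 4\nassert add(-1, 1) == 0"
print(reward(candidate, spec))  # 1.0 -> this sample gets reinforced
```

The point is that the reward comes from execution rather than from imitating existing text, which is why synthetic coding data doesn't hit the same wall as scraped prose.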
They don't ignore it, they just know it's not an actual problem.
It saddens me to see AI detractors being stuck in 2022 and still thinking language models are just regurgitating bits of training data.
That's been my main argument for why LLMs might be at their zenith. But I recently started wondering whether all those codebases we expose to them are maybe good enough training data for the next generation. It's not as high quality as accepted Stack Overflow answers, but it's working software for the most part.
Up until now, no business has been built on tools and technology that no one understands. I expect that will continue.
Given that, I expect that, even if AI is writing all of the code, we will still need people around who understand it.
If AI can create and operate your entire business, your moat is nil. So, you not hiring software engineers does not matter, because you do not have a business.
> Up until now, no business has been built on tools and technology that no one understands. I expect that will continue.
Big claims here.
Did brewers and bakers up to the middle ages understand fermentation and how yeasts work?
They at least understood that it was something deterministic that they could reproduce.
That puts them ahead of the LLM crowd.
Does the corner bakery need a moat to be a business?
How many people understand the underlying operating system their code runs on? Can even read assembly or C?
Even before LLMs, there were plenty of copy-paste JS bootcamp grads that helped people build software businesses.
> Does the corner bakery need a moat to be a business?
Yes, actually. It's hard to open a competing bakery due to location availability, permitting, capex, and the difficulty of converting customers.
To add to that, food establishments generally exist on next to no margin, due to competition, despite all of that working in their favor.
Now imagine what the competitive landscape for that bakery would look like if all of that friction for new competitors disappeared. Margin would tend toward zero.
Most legacy apps are barely understood by anyone, and yet continue to generate value and are (somehow) kept alive.
Many here have been doing "understanding legacy code" as a job for 50+ years.
This "legacy apps are barely understood by anybody" is just something you made up.
> no business has been built on tools and technology that no one understands
Well, there are quite a few common medications where we don't really know how they work.
But I also think it can be a huge liability.
In my experience, using LLMs to code encouraged me to write better documentation, because I can get better results when I feed the documentation to the LLM.
Also, I've noticed failure modes in LLM coding agents when there is less clarity and more complexity in abstractions or APIs. It's actually made me consider simplifying APIs so that the LLMs can handle them better.
Though I agree that in specific cases what's helpful for the model and what's helpful for humans won't always overlap. Once I actually added some comments to a markdown file as a note to the LLM that most human readers wouldn't see, with some more verbose examples.
I think one of the big problems in general with agents today is that if you run the agent long enough they tend to "go off the rails", so then you need to babysit them and intervene when they go off track.
I guess in modern parlance, maintaining a good codebase can be framed as part of a broader "context engineering" problem.
I've also noticed that going-off-the-rails effect. At the start of a session they're pretty sharp and focused, but the longer the session lasts, the more confused they get. At some point they start hallucinating bullshit that they wouldn't have earlier in the session.
It's a vital skill to recognise when that happens and start a new session.
We don't know what Opus 5.0 will be able to refactor.
If the argument is "humans and Opus 4.5 cannot maintain this, but if requirements change we can vibe-code a new one from scratch", that's a coherent thesis, but people need to be explicit about it.
(Instead this feels like the motte that gets retreated to, and the bailey is essentially "who cares, we'll figure out what to do with our fresh slop later".)
Ironically, I've found Claude to be really good at refactors, but these are refactors I choose very explicitly. (For instance, I start the refactor manually, then let it finish.) (For an example, see me force-pushing to https://github.com/NixOS/nix/pull/14863 implementing my own code review.)
But I suspect this is not what people want. To actually fire devs and not rely on from-scratch vibe-coding, we need to figure out which refactors to attempt in order to implement a given feature well.
That's a very creative, open-ended question that I haven't even tried to let the LLMs take a crack at, because why would I? I'm plenty fast being the "ideas guy". If the LLM had better ideas than me, how would I even know? I'm either very arrogant or very good, because I cannot recall regretting one of my refactors, at least not one I didn't back out of immediately.
This is the question! Your narrative is definitely plausible, and I won't be shocked if it turns out this way. But it still isn't my expectation. It wasn't when people were saying this in 2023 or in 2024, and I haven't been wrong yet. It does seem more likely to me now than it did a couple years ago, but still not the likeliest outcome in the next few years.
But nobody knows for sure!
Yeah, I might be early to this. And certainly, I still read a lot of code in my day to day right now.
But I sure write a lot less of it, and the percentage I write continues to go down with every new model release. And if I'm no longer writing it, and the person who works on it after me isn't writing it either, it changes the whole art of software engineering.
I used to spend a great deal of time with already working code that I had written thinking about how to rewrite it better, so that the person after me would have a good clean idea of what is going on.
But humans aren't working in the repos as much now. I think it's just a matter of time before the models are writing code essentially for their eyes, their affordances -- not ours.
Yeah we're not too far from agreement here.
Something I think though (which, again, I could very well be wrong about; uncertainty is the only certainty right now) is that "so the person after me would have a good clean idea of what is going on" is also going to continue mattering even when that "person" is often an AI. It might be different, clarity might mean something totally different for AIs than for humans, but right now I think a good expectation is that clarity for humans is also useful to AIs. So at the moment I still spend time coaxing the AI to write things clearly.
That could turn out to be wasted time, but who knows. I also think of it as a hedge against the risk that we hit some point where the AIs turn out to be bad at maintaining their own crap, at which point it would be good for me to be able to understand and work with what has been written!
Refactoring always costs something, and I doubt LLMs will ever change that. The more interesting question is whether the cost to refactor or "rewrite" the software will ever become negligible. Until it does, it's short-sighted to write code in the manner you're describing. And if software does become that cheap, then you can't meaningfully maintain a business on selling software anyway.
Yeah I think it's a mistake to focus on writing "readable" or even "maintainable" code. We need to let go of these aging paradigms and be open to adopting a new one.
In my experience, LLMs perform significantly better on readable maintainable code.
It's what they were trained on, after all.
However, what they produce is often highly readable but not very maintainable, due to the verbosity and obvious comments. This seems to pollute codebases over time, and you see AI coding efficiency slowly decline.
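For example (a contrived Python snippet, not from any real codebase), the "readable but noisy" output being described looks roughly like this:

```python
# Typical LLM-style output: correct and readable, but padded with
# comments that restate the code instead of explaining intent.
def get_active_users(users):
    # Initialize an empty list to hold the active users
    active_users = []
    # Loop over each user in the list of users
    for user in users:
        # Check if the user is active
        if user.get("active"):
            # Append the active user to the list
            active_users.append(user)
    # Return the list of active users
    return active_users

# The same logic, written for maintenance rather than narration:
def get_active_users_terse(users):
    return [u for u in users if u.get("active")]
```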
> Poe's law is an adage of Internet culture which says that any parodic or sarcastic expression of extreme views can be mistaken for a sincere expression of those views. [1]
The things you mentioned are important, but they have been on their way out for years now regardless of LLMs. Have my ambivalent upvote either way.
[1] https://en.wikipedia.org/wiki/Poe%27s_law
As depressing as it is to say, I think it's a bit like the year is 1906 and we're complaining that these new tyres they're making for cars are bad because they're no longer backwards compatible with the horse-drawn wagons we might want to attach them to in the future.
Yes, exactly.
This is a completely new thing which will have transformative consequences.
It's not just a way to do what you've always done a bit more quickly.
Do readability and maintainability not matter when AI "reads" and maintains the code? I'm pretty sure they do.
If that were true, you could surely ask an LLM to write apps of the same complexity in Brainfuck, right?