Comment by randallsquared

17 hours ago

> In a year or so

Look at the best models from Spring 2025, and compare with now (and similarly for Springs 2024 and 2025). Armstrong and lots of others are betting that this trend will continue, and if it does, the LLMs will ship code the LLMs understand, and whether any human specifically understands any particular part will mostly not matter.

> the LLMs will ship code the LLMs understand, and whether any human specifically understands any particular part will mostly not matter.

I find this particularly funny. There were more than a couple Star Trek Episodes where some alien planet depends on some advanced AI or other technology that they no longer understand, and it turns out the AI is actually slowly killing them, making them sterile, etc. (e.g. https://en.wikipedia.org/wiki/When_the_Bough_Breaks_(Star_Tr... )

Sure, Star Trek is fiction, but "humans rely on a technology that they forget how to make" is a pretty recurrent theme in human history. The FOGBANK saga was pretty recent: https://en.wikipedia.org/wiki/Fogbank

It just amazes me that people think "Sure, this AI generated code is kinda broken now, but all we need is just more AI code to fix it at some unknowable point in the future because humans won't be able to understand it!"

  • If you'd told me 20-30 years ago we'd actually get the Star Trek computer in the mid-2020s and it still wouldn't be actually AGI, I would have thought that very strange and unlikely, so who knows?

  • So nothing about the last 3 years has caused you to update your beliefs on this stuff? feels like bitter cope

And if the trend doesn't continue? I understand that a company with Coinbase's performance has little to lose and not many options, but many companies are in a better position.

The problem is that executives could take the 15-20% productivity boost and be content, but they read stuff like this, get greedy, and they don't understand the risk they're taking.

  • Even if the trend doesn’t continue, the current models are very very good. They’re better than the average programmer in the industry, already.

    • I don't know how anyone who carefully and closely reviews their output could possibly think that. Much of the time their code is fine, but every now and again they make a catastrophic (though often well-hidden) mistake that is so bad that all the tests pass but the codebase will be bricked if enough of those go in. They make such disastrous mistakes frequently enough that a decent-sized codebase can't last for more than 18-24 months.

      If the average programmer is this bad, then there must be better-than-average programmers reviewing the code. The problem with agents is that they can produce code at a far higher volume than the average programmer.

      Anyway, I don't know how well the average programmer programs, but if you commit agent-generated code without careful review, your codebase will be cooked in a year or two.

> and whether any human specifically understands any particular part will mostly not matter.

This is how I feel. It’s building things for me that work. I don’t care how it works under the hood in many cases.

  • It's not about caring how it works. It's about caring that it keeps working at all even after you add stuff to it for a year or three (and nearly all software written by companies is software they evolve).

    • And who’s to say it won’t? It’s working now. I’m adding stuff and it’s still working. Why won’t that continue in year 3?

      6 replies →