
Comment by Night_Thastus

14 hours ago

The problem with such infrastructure is not the initial development overhead.

It's the maintenance: the long-term, slow-burn, uninteresting work that must be done continually. Someone needs to be behind it for the long haul or it will never get adopted and used widely.

Right now, at least, LLMs are not great at that. They're great for quickly creating smaller projects, but they get worse as those projects grow older and larger.

I mean, the claim is that next-generation models are getting better and better at working over larger contexts. I find that GPT 5.4 xhigh is surprisingly good at analysis even on larger codebases.

https://x.com/mitchellh/status/2029348087538565612

Stuff like this, where models are root-causing nontrivial, large-scale bugs, is already happening with SOTA models.

I would not be surprised if next-generation models can both root-cause those issues more reliably and implement the fixes better. At that point they would be sufficiently good maintainers.

They are suggesting that new models can chain multiple newly discovered vulnerabilities into RCE, privilege escalation, and so on. You can't do that reliably without larger-scope planning and understanding.