Comment by gentooflux
7 hours ago
If it empirically works, then sure. But if instead every solution it provides beyond a few trivial lines falls somewhere between "just a little bit off" and "relies entirely on core library functionality that doesn't actually exist", then I'd say it does matter, and it's only slightly better than an opaque box that spouts random nonsense (which will soon include ads).
This sounds like you're copy-pasting code from ChatGPT's web interface, which is very 2024.
Agentic LLMs will notice when something is crap and won't compile: they'll retry, use the tools available to them to figure out the correct approach, edit, and try again.
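The loop being described is roughly: generate code, try to compile or run it, and if that fails, feed the error message back to the model and ask again. Here is a minimal sketch of that edit-compile-retry loop. The `model` function is a hypothetical stub standing in for an LLM call; a real harness would send the prompt and the error text to an actual model.

```python
def model(prompt, error=None):
    # Hypothetical stub for an LLM call. The first attempt returns broken
    # code (missing colon); once it sees compiler feedback, it returns a fix.
    if error is None:
        return "def add(a, b) return a + b"      # won't compile
    return "def add(a, b):\n    return a + b"    # corrected

def agent_loop(prompt, max_retries=3):
    error = None
    for _ in range(max_retries):
        code = model(prompt, error)
        try:
            compile(code, "<generated>", "exec")  # cheap validity check
            return code                           # it compiles: accept it
        except SyntaxError as e:
            error = f"{e.msg} at line {e.lineno}" # feed the error back
    raise RuntimeError("agent gave up after retries")

working_code = agent_loop("write an add function")
```

Real agents go further than a syntax check (they run tests, read docs, grep the codebase), but the structure is the same: a feedback loop, not a single copy-paste.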
Those are 2024-era criticisms of LLMs for code.
Late 2025 models very rarely hallucinate nonexistent core library functionality - and they run inside coding agent harnesses, so if they DO, they notice that the code doesn't work and fix it.
Get ready to tick those numbers over to 2026!