Comment by mjburgess
19 days ago
The relevant scale is the number of hard constraints on the solution code, not the size of the task as measured by "hours it would take the median programmer to write".
So, e.g., one line of code that needs to satisfy dozens of hard constraints imposed by the system (e.g., use a specific class and method, target a specific device, follow specific memory-management rules, etc.) will very rarely be output correctly by an LLM.
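For a concrete, hypothetical illustration of that kind of line (my sketch, assuming a PyTorch setup with at least two CUDA devices; none of the specifics come from the comment itself), here is a single statement where nearly every token encodes a hard constraint: the exact factory function, dtype, pinned host memory, target device index, and transfer mode all have to be right.

    # Hypothetical example: assumes PyTorch is installed and the machine
    # has at least two CUDA devices.
    import torch

    # Pinned host staging buffer, half precision, copied asynchronously to GPU 1.
    # Each keyword is a hard constraint: the wrong dtype, pageable memory,
    # the wrong device index, or a blocking copy breaks the surrounding system.
    staging = torch.empty((32, 1024), dtype=torch.float16, pin_memory=True).to("cuda:1", non_blocking=True)

None of those choices is derivable from a prompt like "make me a staging buffer"; they come from the constraints of the surrounding system.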
Likewise "blank-page, vibe coding" can be very fast if "make me X" has only functional/soft-constraints on the code itself.
"Gigawatt LLMs" have brute-forced there way to having a statistical system capable of usefully, if not universally, adhreading to one or two hard constraints. I'd imagine the dozen or so common in any existing application is well beyond a Terawatt range of training and inference cost.
Keep in mind that this model of using LLMs assumes the underlying dataset converges to production-ready code. That's never been proven, because we know they scraped source code without attribution.