Comment by Night_Thastus

17 hours ago

LLMs hold some real utility. But that real utility is buried under a mountain of fake hype and over-promises to keep shareholder value high.

LLMs have real limitations that aren't going away any time soon, not until we move to a fundamentally different technology that shares almost nothing in common with them. There's a lot of 'progress-washing' going on, where people claim that these shortfalls will magically disappear if we throw enough data and compute at them, when they clearly will not.

Pretty much. What actually exists is very impressive. But what was promised and marketed has not been delivered.

  • I think the missing ingredient is not something the LLMs lack, but something we as developers don't do: we need to constrain, channel, and guide agents by creating reactive test environments around them. Not vibes, but hard tests; they are the missing ingredient for coding agents. You can even use AI to write most of these tests, but the end result depends on how well you structured your code to be testable.

    If you inherit 9000 tests from an existing project, you can vibe code a replacement on your phone over a holiday, like Simon Willison's JustHTML port. We are moving from agents semi-randomly flailing around to constraint satisfaction.
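    To make the "hard tests, not vibes" idea concrete, here is a minimal sketch of a test gate. Everything in it is an assumption for illustration: `failing_cases`, `first_passing`, and the toy `shout`/`whisper` candidates are hypothetical names, and a real setup would run the project's inherited test suite rather than input/output pairs.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical types: a "candidate" stands in for code an agent proposes,
# and a test case is an (input, expected output) pair.
Candidate = Callable[[str], str]
TestCase = Tuple[str, str]


def failing_cases(candidate: Candidate, tests: Iterable[TestCase]) -> List[TestCase]:
    """Return every test case the candidate gets wrong or crashes on."""
    failures = []
    for given, expected in tests:
        try:
            actual = candidate(given)
        except Exception:
            # A crash counts as a failure, not as a reason to stop checking.
            failures.append((given, expected))
            continue
        if actual != expected:
            failures.append((given, expected))
    return failures


def first_passing(candidates: Iterable[Candidate], tests: List[TestCase]) -> Optional[Candidate]:
    """Accept the first candidate that passes every test, or None if none does."""
    for candidate in candidates:
        if not failing_cases(candidate, tests):
            return candidate
    return None


# Toy usage: two "agent attempts" at an uppercase function.
def whisper(text: str) -> str:
    return text.lower()


def shout(text: str) -> str:
    return text.upper()


TESTS = [("a", "A"), ("bc", "BC")]
accepted = first_passing([whisper, shout], TESTS)
```

    The point of the gate is that acceptance is decided by the tests alone, and the list returned by `failing_cases` is the kind of feedback you would hand back to the agent for its next attempt.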

  • Yes, and most of the investment has been a kind of post-GPT-4 bet that things will get exponentially more impressive.

  • I find Opus 4.5 and GPT 5.2 mind-blowing more often than I find them dumb as rocks. I don’t listen to or read any marketing material; I just use the tools. I couldn’t care less about what the promises are. What I have available to me now is fundamentally different from what I had in August, and it completely changed how I work.

  • Markets never deliver. That isn't new. I do think LLMs are not far off from Google in terms of impact.

    Search, as of today, is inferior to frontier models as a product. However, even the best case still misses expected returns by miles, which is where the grousing comes from.

    Generative art/AI is still up in the air for staying power, but I'd predict it isn't going away.