Comment by crystal_revenge

19 hours ago

> I don't understand why Hacker News is so dismissive about the coming of LLMs

I find LLMs incredibly useful, but if you were following along the last few years the promise was for “exponential progress” with a teaser of world-destroying superintelligence.

We objectively are not on that path. There is no “coming of LLMs”. We might get some incremental improvement, but we’re very clearly seeing sigmoid progress.

I can’t speak for everyone, but I’m tired of hyperbolic rants that are unquestionably not justified (the nice thing about exponential progress is you don’t need to argue about it)

I'm not sure I understand: we are _objectively on that path_ -- we are increasing exponentially on a number of metrics that may be imperfect but seem to paint a pretty consistent picture. Scaling laws are exponential. METR's time horizon benchmark is exponential. Lots of performance measures are exponential, so why do you say we're objectively not on that path?

> We might get some incremental improvement, but we’re very clearly seeing sigmoid progress.

Again, if it is "very clear", can you point to some concrete examples to illustrate what you mean?

> I can’t speak for everyone, but I’m tired of hyperbolic rants that are unquestionably not justified (the nice thing about exponential progress is you don’t need to argue about it)

OK but what specifically do you have an issue with here?

> exponential progress

First you need to define what it means. What's the metric? Otherwise it's very much something you can argue about.

  • Time spent being human and enjoying life.

    I can’t point at many problems it has meaningfully solved for me. I mean real problems, not tasks that I have to do for my employer. It seems like it just made parts of my existence more miserable, poisoned many of the things I love, and generally made the future feel a lot less certain.

  • > What's the metric?

    Language model capability at generating text output.

    The model progress this year has been a lot of:

    - “We added multimodal”

    - “We added a lot of non AI tooling” (ie agents)

    - “We put more compute into inference” (ie thinking mode)

    So yes, there is still rapid progress, but these ^ make it clear, at least to me, that next gen models are significantly harder to build.

    Simultaneously we see a distinct narrowing between players (openai, deepseek, mistral, google, anthropic) in their offerings.

    That's usually a signal that the rate of progress is slowing.

    Remind me what was so great about gpt 5? How about gpt4 from gpt 3?

    Do you even remember the releases? Yeah. I don't. I had to look it up.

    Just another model with more or less the same capabilities.

    “Mixed reception”

    That is not what exponential progress looks like, by any measure.

    The progress this year has been in the tooling around the models and in smaller, faster models with similar capabilities. Multimodal add ons that no one asked for, because it's easier to add image and audio processing than improve text handling.

    That may still be on a path to AGI, but it is not an exponential path to it.

    • I don’t think the path was ever exponential, but your claim here reads almost as if the slowdown hit an asymptote-like wall.

      Most of the improvements are intangible. Can we truly say how much more reliable the models are? We barely have quantitative measurements on this so it’s all vibes and feels. We don’t even have a baseline metric for what AGI is and we invalidated the Turing test also based on vibes and feels.

      So my argument is that part of the slowdown is itself a hallucination, because the improvement is not actually measurable or definable outside of vibes.

    • > Language model capability at generating text output.

      That's not a metric, that's a vague non-operationalized concept that could be operationalized into an infinite number of different metrics. And an improvement that was linear in one of those possible metrics would be exponential in another one (well, actually, one that was linear in one would also be linear in an infinite number of others, as well as being exponential in an infinite number of others).

      That’s why you have to define an actual metric, not simply describe a vague concept of a kind of capacity of interest, before you can meaningfully discuss whether improvement is exponential. Because the answer is necessarily entirely dependent on the specific construction of the metric.
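To illustrate the point about metric choice, here is a minimal sketch. The scores are hypothetical numbers invented for the example, not measurements of any real model. The same sequence of models looks linear under one metric and exponential under a monotone re-scaling of it, even though both metrics rank the models identically.

```python
import math

# Hypothetical capability scores for six model generations (made-up data),
# chosen so that metric A rises by a constant step per generation.
metric_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Metric B is a monotone re-scaling of A, so it orders the models the same way.
metric_b = [math.exp(a) for a in metric_a]


def differences(xs):
    """Step-to-step differences; constant for linear growth."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


def ratios(xs):
    """Step-to-step ratios; constant for exponential growth."""
    return [later / earlier for earlier, later in zip(xs, xs[1:])]


def is_constant(xs, rel_tol=1e-9):
    """True if every value matches the first within a relative tolerance."""
    return all(math.isclose(x, xs[0], rel_tol=rel_tol) for x in xs)


linear_in_a = is_constant(differences(metric_a))
exponential_in_b = is_constant(ratios(metric_b))
linear_in_b = is_constant(differences(metric_b))
```

Under these assumptions `linear_in_a` and `exponential_in_b` both hold while `linear_in_b` does not, so "is progress exponential?" has no answer until the metric is fixed.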

    • > Language model capability at generating text output.

      That's not a quantifiable sentence. Unless you put it in numbers, anyone can argue exponential/not.

      > next gen models are significantly harder to build.

      That's not how we judge capability progress though.

      > Remind me what was so great about gpt 5? How about gpt4 from gpt 3?

      > Do you even remember the releases?

      At gpt 3 level we could generate some reasonable code blocks / tiny features. (An example shown around at the time was "explain what this function does" for a "fib(n)") At gpt 4, we could build features and tiny apps. At gpt 5, you can often one-shot build whole apps from a vague description. The difference between them is massive for coding capabilities. Sorry, but if you can't remember that massive change... why are you making claims about the progress in capabilities?

      > Multimodal add ons that no one asked for

      Not only does multimodal input training improve the model overall, it's useful for (for example) feeding back screenshots during development.

  • Define it however you like. There's not a single chart you can draw that even begins to look like a sigmoid.

> but we’re very clearly seeing sigmoid progress.

Yeah, probably. But no chart actually shows it yet. For now we are firmly in the exponential zone of the sigmoid curve and can't really tell if it's going to end in a year, a decade or a century.
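A rough sketch of why the early part of a sigmoid is hard to tell apart from an exponential. It assumes a standard logistic curve with made-up parameters (`ceiling`, `rate`, `midpoint` are illustrative, not fitted to any benchmark). Well before the midpoint, the step-to-step growth ratio is almost exactly that of a pure exponential; the flattening only shows up later.

```python
import math


def logistic(t, ceiling=1.0, rate=1.0, midpoint=10.0):
    """Standard logistic (sigmoid) curve; parameters are illustrative."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def growth_ratio(t, **params):
    """How much the curve multiplies over one unit of time starting at t."""
    return logistic(t + 1.0, **params) / logistic(t, **params)


# A pure exponential with the same rate multiplies by e**rate every step.
pure_exponential_ratio = math.exp(1.0)

early_ratio = growth_ratio(0.0)   # far below the midpoint
late_ratio = growth_ratio(20.0)   # far above the midpoint

# Relative gap between the sigmoid's early growth and a true exponential.
early_gap = abs(early_ratio - pure_exponential_ratio) / pure_exponential_ratio
```

With these parameters `early_gap` is well under one percent, while `late_ratio` is close to 1, meaning growth has effectively stopped. Data drawn only from the early region cannot distinguish the two curves.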

  • Doesn't even matter if the goal is extremely high. Talking about exponential when we clearly see matching energy needs proves there is no way we can maintain that pace without radical (and thus unpredictable) improvements.

    My own "feeling" is that it's definitely not exponential but again, doesn't matter if it's unsustainable.

I’ve been reading this comment multiple times a week for the last couple years. Constant assertions that we’re starting to hit limits, plateau, etc. But a cursory glance at where we are today vs a year ago, let alone two years ago, makes it wildly obvious that this is bullshit. The pace of improvement of both models and tooling has been breathtaking. I could give a shit whether you think it’s “exponential”, people like you were dismissing all of this years ago, meanwhile I just keep getting more and more productive.

  • People keep saying stuff like this. That the improvements are so obvious and breathtaking and astronomical and then I go check out the frontier LLMs again and they're maybe a tiny bit better than they were last year but I can't actually be sure bcuz it's hard to tell.

    sometimes it seems like people are just living in another timeline.

We're very clearly seeing exponential progress - even above trend, on METR, whose slope keeps getting revised to a higher and higher estimate each time. Explain your perspective on the objective evidence against exponential progress?

  • Pretty neat how this exponential progress hasn't resulted in exponential productivity. Perhaps you could explain your perspective on that?

    • Because that requires adoption. Devs on Hacker News are already the most up to date folks in the industry and even here adoption of LLMs is incredibly slow. And a lot of the adoption that does happen is still with older tech like ChatGPT or Cursor.

    • Writing the code itself was never the main bottleneck. Designing the bigger solution, figuring out tradeoffs, talking to affected teams, etc. takes as much time as it used to. But still, there's definitely a significant improvement in the code production part in many areas.

    • I think this is an open question still and very interesting. Ilya discussed this on the Dwarkesh podcast. But the growth in LLM capabilities is clearly exponential and perhaps super exponential. We went from something that could string together incoherent text in 2022 to general models helping people like Terence Tao and Scott Aaronson write new research papers. LLMs also beat IMO and the ICPC. We have entered the John Henry era for intellectual tasks...

    • Sir, we're in a modern economy, we don't ever ever look at productivity graphs (this is not to disparage LLMs, just a comment on productivity in general)

    • It has! CLs/engineer increased by 10% this year.

      LLMs from late 2024 were nearly worthless as coding agents, so given they have quadrupled in capability since then (exponential growth, btw), it's not surprising to see a modestly positive impact on SWE work.

      Also, I'm noticing you're not explaining yourself :)

    • How long before the introduction of computers led to increases in average productivity? How long for the internet? Business is just slow to figure out how to use anything for its benefit, but it eventually gets there.
