
Comment by jonnycat

1 month ago

I think this argument only makes sense if you believe that AGI and/or unbounded AI agents are "right around the corner". For sure, we will keep progressing in that direction, but when and whether we truly get there, who knows?

If you believe, as I do, that these things are a lot further off than some people assume, I think there's plenty of time to build a successful business solving domain-specific workflows in the meantime, and eventually adapting the product as more general technology becomes available.

Let's say 25 years ago you had the idea to build a product around a problem that can now be solved more generally with LLMs, say, a really effective spam filter. Even knowing what you know now, would it have been right at the time to say, "Nah, don't build that business, it will eventually be solved with some new technology"?

I don't think it's that binary. We've had a lot of progress over the last 25 years, much of it in the last two. AGI is not a well-defined thing that people easily agree on, so determining whether we have it or not is actually not that simple.

Mostly, people either get bogged down in deep philosophical debates or simply start listing things that AI can and cannot do (and why they believe that is the case). Some of those things are codified in benchmarks. And of course, items are being removed from the list of things AIs can't do on a regular basis, at an accelerating rate. That acceleration is the problem: people don't deal well with adapting to exponentially changing trends.

At some arbitrary point, when that list has a certain length, we may or may not have AGI; it really depends on your point of view. But of course, most people score poorly on the same benchmarks we use for testing AIs. There are some specific groups of tasks where humans still do better, but there are also a lot of AI researchers working on exactly those things.

  • What acceleration?

    Consider OpenAI's products as an example. GPT-3 (2020) was a massive step up in reasoning ability from GPT-2 (2019). GPT-3.5 (2022) was another massive step up. GPT-4 (2023) was a big step up, but not quite as big. GPT-4o (2024) was marginally better at reasoning, but mostly an improvement with respect to non-core functionality like images and audio. o1 (2024) is apparently somewhat better at reasoning at the cost of being much slower. But when I tried it on some puzzle-type problems I thought would be on the hard side for GPT-4o, it gave me (confidently) wrong answers every time. 'Orion' was supposed to be released as GPT-5, but was reportedly cancelled for not being good enough. o3 (2025?) did really well on one benchmark at the cost of $10k in compute, or even better at the cost of >$1m – not terribly impressive. We'll see how much better it is than o1 in practical scenarios.

    To me that looks like progress is decelerating. Admittedly, OpenAI's releases have gotten more frequent and that has made the differences between each release seem less impressive. But things are decelerating even on a time basis. Where is GPT-5?

  • >Let's say 25 years ago you had the idea to build a product

    I resemble that remark ;)

    >that can now be solved more generally with LLMs

    Nope, sorry, not yet.

    >"Nah, don't build that business, it will eventually be solved with some new technology?"

    Actually, I did listen to people like that to an extent, and started my business with the express intent of continuing to develop new technologies that would be adjacent to AI when it matured, just better than I could have at my employer, where that work was already in progress. It took a couple of years before I was financially stable enough to consider layering in a neural network, but that was 30 years ago now :\

    Wasn't possible to benefit with Windows 95-era hardware; oh well, didn't expect a miracle anyway.

    Heck, it's now been a full 45 years since I first dabbled in a bit of the ML with more kilobytes of desktop memory than most people had ever seen. I figured all that memory should be used for something, like memorizing, why not? Seemed logical. Didn't take long to figure out how much megabytes would have helped, but those didn't exist yet. And it became apparent that you could only go so far without a specialized computer chip of some kind to replace or augment a microprocessor CPU. What kind, I really had no idea :)

    I didn't say they resembled 25-year-old ideas that much anyway ;)

    >We've had a lot of progress over the last 25 years; much of it in the last two.

    I guess it's understandable this has been making my popcorn more enjoyable than ever ;)

Agreed. There's a difference between developing new AI, and developing applications of existing AI. The OP seems to blur this distinction a bit.

The original "Bitter Lesson" article referenced in the OP is about developing new AI. In that domain, its point makes sense. But for the reasons you describe, it hardly applies at all to applications of AI. I suppose it might apply to some, but they're exceptions.

You think it will be 25 years before we have a drop-in replacement for most office jobs?

I think it will be less than 5 years.

You seem to be assuming that the rapid progress in AI will suddenly stop.

I think if you look at the history of compute, that is ridiculous. Making the models bigger, or having them work more, is making them smarter.

Even if there is no progress in scaling memristors or any other exotic new paradigm, high-speed memory organized to localize data in frequently used neural circuits, plus photonic interconnects, surely offers multiple orders of magnitude of scaling gains over the next several years.

  • > You seem to be assuming that the rapid progress in AI will suddenly stop.

    And you seem to assume that it will just continue for 5 years. We've already seen the plateau start. OpenAI has tacitly acknowledged that they don't know how to make a next-generation model, and have been working on stepwise iteration for almost 2 years now.

    Why should we project the rapid growth of 2021–2023 five years into the future? It seems far more reasonable to project the growth of 2023–2025, which has been fast but not earth-shattering, and then also factor in the second derivative we've seen in that time and assume that progress will actually continue to slow from here.

    • At this point, the lack of progress since April 2023 is really what is shocking.

      I just looked at the Midjourney subreddit to make sure I wasn't missing some great new model.

      Instead, what I notice is small variations on themes I had already seen a thousand times a year ago. Midjourney is so limited in what it can actually produce.

      I am really worried that all this is much closer to a parlor trick ("a simple trick or demonstration that is used especially to entertain or amuse guests") than to AGI.

      It all feels more and more like that to me than any kind of progress towards general intelligence.

    • > OpenAI has tacitly acknowledged that they don't know how to make a next generation model

      Can you provide a source for this? I'm not super plugged into the space.


  • I think you're suffering from some survivorship bias here. There are a lot of technologies that don't work out.

  • > You seem to be assuming that the rapid progress in AI will suddenly stop.

    > I think if you look at the history of compute, that is ridiculous. Making the models bigger or work more is making them smarter.

    It's better to talk about actual numbers to characterise progress and measure scaling; a rough worked example of the curve in question follows the quotes below:

    " By scaling I usually mean the specific empirical curve from the 2020 OAI paper. To stay on this curve requires large increases in training data of equivalent quality to what was used to derive the scaling relationships. "[^2]

    "I predicted last summer: 70% chance we fall off the LLM scaling curve because of data limits, in the next step beyond GPT4.

    […]

    I would say the most plausible reason is because in order to get, say, another 10x in training data, people have started to resort either to synthetic data, so training data that's actually made up by models, or to lower quality data."[^0]

    “There were extraordinary returns over the last three or four years as the Scaling Laws were getting going,” Dr. Hassabis said. “But we are no longer getting the same progress.”[^1]

    ---

    [^0]: https://x.com/hsu_steve/status/1868027803868045529

    [^1]: https://x.com/hsu_steve/status/1869922066788692328

    [^2]: https://x.com/hsu_steve/status/1869031399010832688
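
    For concreteness, here is a minimal sketch of the data term of the 2020 scaling-law fit those quotes refer to. It is only an illustration: the constants are the approximate published fits from that paper and should be treated as assumptions, but the shape of the curve is the point, namely how little another 10x of training data buys once you are far along the power law.

    ```python
    # Data-limited term of the 2020 scaling-law fit: loss ~ (D_c / D) ** alpha_D.
    # The constants below are approximate published fits, used here only for
    # illustration; treat the exact values as assumptions.

    ALPHA_D = 0.095   # fitted exponent for dataset size
    D_C = 5.4e13      # fitted constant, in tokens

    def loss_from_data(tokens: float) -> float:
        """Data-limited loss term for a training set of `tokens` tokens."""
        return (D_C / tokens) ** ALPHA_D

    for d in (3e11, 3e12, 3e13):  # 0.3T, 3T, 30T tokens
        print(f"{d:.0e} tokens -> loss term {loss_from_data(d):.3f}")

    # Each 10x of data shrinks this term by only ~20% (10 ** -0.095 ≈ 0.80),
    # which is why "another 10x of equivalent-quality data" is the sticking point.
    ```
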

    • o1 proved that synthetic data and inference-time compute are a new ramp. There will be more challenges and more innovations. There is still a lot of room left in hardware, software, model training, and model architecture.
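
      As a rough illustration of what treating inference time as a new ramp means, here is a minimal self-consistency sketch: sample several answers and keep the majority, trading extra inference-time compute for accuracy. The ask_model function is a hypothetical stand-in, not any particular vendor's API.

      ```python
      import random
      from collections import Counter

      def ask_model(question: str) -> str:
          """Hypothetical stand-in for one stochastic model call."""
          # Pretend the model is right 60% of the time on this question.
          return "42" if random.random() < 0.6 else random.choice(["41", "43"])

      def majority_vote(question: str, samples: int = 16) -> str:
          """Spend more inference-time compute by sampling several answers
          and returning the most common one (self-consistency)."""
          answers = [ask_model(question) for _ in range(samples)]
          return Counter(answers).most_common(1)[0][0]

      # One call is right ~60% of the time; a 16-way vote is right far more often.
      print(majority_vote("What is 6 * 7?"))
      ```
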


  • Also, office jobs will be adapted to better fit what AI can do, just as manufacturing jobs were adapted so that at least some tasks could be completed by robots.

  • Not my downvote, quite the opposite, but I think you can do a lot in an office already if you start early enough...

    At one time I would have said you should be able to have an efficient office operation using regular typewriters, copiers, filing cabinets, fax machines, etc.

    And then you get Office 97, zip through everything and never worry about office work again.

    I was pretty extreme, going paperless when my only product is paperwork, but I got there. And I started my office with typewriters, nice ones too.

    Before long Google gets going. Wow. No-ads information superhighway, if this holds it can only get better. And that's without broadband.

    But that's beside the point.

    Now it might make sense for you to at least be able to run an efficient office on the equivalent of Office 97 to begin with. Then throw in the AI, or let it take over, and see what you get in terms of output and how it compares. Microsoft is probably already doing this in an advanced way. I think a factor that can vary over orders of magnitude is how well the machine leverages the abilities and/or tasks of the nominal human "attendant".

    One type of situation would be where a less-capable AI augmenting a given worker is more effective than even a fully automated alternative using a 10x more capable AI. There's always some attendant somewhere, so you don't get a zero in this equation no matter how close you come.

    Could be financial effectiveness or something else, the dividing line could be a moving target for a while.

    You could even go full paleo and train the AI on the typewriters and stuff just to see what happens ;)

    But would you really be able to get the most out of it without the momentum of many decades of continuous improvement before capturing it at the peak of its abilities?

We already have AGI in some ways though. Like I can use Claude for both generating code and helping with some maths problems and physics derivations.

It isn't a specific model for any of those problems, but a "general" intelligence.

Of course, it's not perfect, and it's obviously not sentient or conscious, etc., but maybe general intelligence doesn't require or imply that at all?

  • For me, general intelligence from a computer will be achieved when it knows when it's wrong. You may say that humans also struggle with this, and I'd agree, but I think there's a difference between general intelligence and consciousness, as you said.

    • Being wrong is one thing; knowing that they don't know something, on the other hand, is something humans are pretty good at (even if they might not admit it and start bullshitting anyway). Current AI predictably fails miserably at this every single time.
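
      One way to make "knows when it's wrong" concrete is calibration: have the model state a confidence with each answer and check whether stated confidence tracks actual accuracy. A minimal sketch, assuming you already have (confidence, was_correct) pairs from some evaluation:

      ```python
      from collections import defaultdict

      def expected_calibration_error(preds, bins: int = 10) -> float:
          """preds: iterable of (confidence in [0, 1], was_correct) pairs.
          Returns the average gap between stated confidence and actual
          accuracy across confidence bins (a simple ECE)."""
          preds = list(preds)
          buckets = defaultdict(list)
          for conf, correct in preds:
              buckets[min(int(conf * bins), bins - 1)].append((conf, correct))
          ece = 0.0
          for items in buckets.values():
              avg_conf = sum(c for c, _ in items) / len(items)
              accuracy = sum(1 for _, ok in items if ok) / len(items)
              ece += (len(items) / len(preds)) * abs(avg_conf - accuracy)
          return ece

      # A model that says "90% sure" but is right half the time scores badly here.
      sample = [(0.9, True), (0.9, False), (0.9, False), (0.9, True), (0.6, True)]
      print(expected_calibration_error(sample))
      ```
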
