Comment by lolinder

19 hours ago

I was building things with GPT-2 in 2019. I have as much experience engineering with these models as anyone who wasn't already an AI researcher back then.

And no, we're not at a fundamentally different place than we were just 12 months ago. The last 12 months saw much slower growth than the 12 months before that, which saw slower growth than the 12 months before that. And these tools still have the same weaknesses I saw in GPT-2, just to a lesser degree.

The only aspect in which we are in a fundamentally different place is that the hype has gone through the roof. The tools themselves are better, but not fundamentally different.

It’s genuinely difficult to take seriously the claim that coding with Sonnet has “the same weaknesses” as GPT-2, which was effectively useless for the task. It’s like saying a flamethrower has the same weaknesses as a matchstick because both can be put out with water.

We’ll have to agree to disagree about whether the last 12 months have seen as much innovation as the preceding 12. We started 2024 with no models better than GPT-4, and we ended the year with multiple open-source models that beat GPT-4 and can run on your laptop, not to mention a bunch of models that trounce it. Plus tons of other innovations: dramatically cheaper training and inference costs, reasoning models, expanded multimodal capabilities, etc.

I’m guessing you’ve already seen and dismissed it, but in case you’re interested in an overview, this is a good one: https://simonwillison.net/2024/Dec/31/llms-in-2024/

  • I'm paying for o1-pro (just for one month) and have been using LLMs since GPT-2 (via AI Dungeon). Progress is absolutely flattening when you look at practical applications rather than benchmarks.

    o1 is actually surprisingly "meh", and I just don't see how they can justify the price when the latest Sonnet 3.5 is almost as good, 10x as fast, and doesn't even have "reasoning".

    I've been spending half of every day for the past few years using LLMs in one way or another. They still confidently (and unpredictably) hallucinate, even o1. They have no memory, can't build up experience, their performance rapidly degrades over long conversations, etc.

    I'm not saying progress isn't being made, but the rate of progress is definitely slowing.