
Comment by senordevnyc

10 hours ago

It’s genuinely difficult to take seriously a claim that coding using Sonnet has “the same weaknesses” as GPT-2, which was effectively useless for the task. It’s like suggesting that a flamethrower has the same weaknesses as a matchstick because they both can be put out by water.

We’ll have to agree to disagree about whether the last 12 months have seen as much innovation as the preceding 12 months. We started 2024 with no models better than GPT-4, and we ended the year with multiple open source models that beat GPT-4 and can run on your laptop, not to mention a bunch of models that trounce it. Plus tons of other innovations: dramatically cheaper training and inference costs, reasoning models, expanded multi-modal capabilities, etc, etc.

I’m guessing you’ve already seen and dismissed it, but in case you’re interested in an overview, this is a good one: https://simonwillison.net/2024/Dec/31/llms-in-2024/

I'm paying for o1-pro (just for one month) and have been using LLMs since GPT-2 (via AI Dungeon). Progress is absolutely flattening when you're looking at practical applications versus benchmarks.

o1 is actually surprisingly "meh", and I just don't see how they can justify the price when the latest Sonnet 3.5 is almost as good, 10x as fast, and doesn't even have "reasoning".

I've been spending half my day, every day, for the past few years using LLMs in one way or another. They still confidently (and unpredictably) hallucinate, even o1. They have no memory, can't build up experience, performance rapidly degrades with long conversations, etc.

I'm not saying progress isn't being made, but the rate of progress is definitely slowing.