Comment by zachthewf

6 hours ago

I didn't read the posted article (I don't read this author anymore because I think it's basically anti-AI ideological propaganda).

But from the article I linked back in March 2024:

"Generative AI models are expensive and compute-intensive without providing obvious, tangible mass-market use cases. Murati and Altman's futures depend heavily on keeping the world believing that development and improvement of their models' capabilities will continue a rapacious pace of progress that has unquestionably slowed, with OpenAI admitting that GPT-4 may be worse on some tasks.

As I've written before, hallucinations are a feature not a bug. These models do not "know" anything. They are mathematical behemoths generating a best guess based on training data and labeling, and thus do not "know" what you are asking it to do. You simply cannot fix them. Hallucinations are not going away."

Since then:

- hallucinations are dramatically less of a problem

- several mass market use cases have emerged, most notably coding

- rate of progress has increased

I think the points you raise are reasonable signals to consider, but I don't think they show the author being "consistently wrong". The overall thesis still remains plausible even though we have seen LLMs continue to improve.

> - hallucinations are dramatically less of a problem

Sure, but it remains a big enough problem that human intervention and review is still necessary for any serious work across all use cases and industries.

> - several mass market use cases have emerged, most notably coding

Coding seems to be the only one, but there are still a lot of open questions about how the market can sustain the costs, and that's without considering the market dynamics that could emerge once costs are lowered enough that open source models start to become an attractive option.

> - rate of progress has increased

Debatable.

  • > Sure, but it remains a big enough problem that human intervention and review is still necessary for any serious work across all use cases and industries.

    Another important consideration: hallucinations becoming less common and less severe, but not (as good as) solved, arguably makes them worse.

    LLMs used to very obviously get things wrong. And people wouldn't trust them. Now they're good enough that people blindly trust them.

    Now people just directly PR AI output with little to no manual review. We even have clowns calling for the complete abolition of directly human-authored code.

    Whatever gains were had in better AI code output over the past two years I lose in having to review much more thoroughly.

Hallucinations are still a problem. I recently asked one to give me a quote from a book, figuring that since these AI companies have pirated all books in existence, surely it can just recite a specific passage, no? It hallucinated the quote, even though I had told it which chapter it was in. Had I not read the book recently, maybe I would've believed the hallucinated quote.

And it got me thinking: they sell these AIs as assistants, but it couldn't even look up a passage from a book. This is basic, elementary stuff; it should get it right. I would have fired this assistant right away if it were a person. Not only did it get it totally wrong, it told me with the utmost confidence that this was the quote from the book. Unreliable assistants? That's the product they're trying to sell? Get out of here with that trash. I can't trust it.

Has the rate of progress increased? How does one measure that? Genuinely curious - it would be very interesting to map out the "effectiveness" of each AI model vs how long it took to train/release.

From my perspective, the model gains are mostly incremental now and a lot of the gains are just from things like improving the agent harnesses. I could be wrong though.

  • On the front page right now is the newest announcement from Xiaomi, serving a large model at over 1,000 tok/s on standard server GPUs.

    Every facet of the field is being pushed on and advanced at the same time.

> hallucinations are dramatically less of a problem

No they aren't. The models still hallucinate just like they always did. You cannot trust them, ever, to get something right.

> several mass market use cases have emerged, most notably coding

They aren't really useful for coding, for the reason above. Since you can't trust them, you have to carefully review everything they make, which in turn destroys any productivity they could've given you.

> rate of progress has increased

I have yet to see any progress. Opus 4.8 that you get today is no more effective than GPT-3.5 was. Much less would I agree that the rate of progress has increased. Only hype has increased, but there has yet to be a drop of substance.

> several mass market use cases have emerged, most notably coding

Most notably? This is not a mass market use case in the way the author is describing. They are asserting that the amount of spend they need to get this off the ground necessitates the entire world coming in on it, and I would say that opinion has aged pretty well. There are a lot of coders, but there are more people scratching their heads as AI is shoved into every part of their lives.