Comment by somewhereoutth

10 days ago

My understanding is that the cost of training each successive model is very large, and a half-trained model is worthless.

Thus when it is realised that this investment cannot produce the necessary returns, there will simply be no next model. People will continue using the old models, but they will become more and more out of date, and less and less useful, until they are not much more than historical artifacts.

My point is that the threshold for continuing this process (training new models) is very high (and getting higher each time?), so the 'pop' will be a step function to zero.

Why do you think the models will become out of date and less useful? Like, compared to what? What external factor makes the models less useful?

If it's just to catch up with newly discovered knowledge or information, then that's not a flaw in the model itself; they can retrain with an updated dataset and probably don't need to start from scratch.

  • > What external factor makes the models less useful?

    Life. A great example can be seen in AI-generated baseball news articles involving the Athletics organization. This year, AI-generated articles have incorrectly stated that the Atlanta Braves played in games that were actually played by the Athletics, because of outdated training data. For the 60 years before 2025, the Athletics played in Oakland, and during that time their acronym was OAK. In 2025 they left Oakland for Sacramento and changed their acronym to ATH. The problem is that the models were trained on 60 years of data in which 1. team acronyms are always based on the city rather than the team's mascot, and 2. OAK = Athletics, ATL = Atlanta Braves, and ATH = nothing. As a result, a model that doesn't have the context "OAK == ATH in the 2025 season" will see ATH in the input data, associate ATH with nothing in its model, and erroneously assume ATH is a typo for ATL. (A sketch of the prompt-context workaround follows below.)
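A minimal sketch of that workaround: injecting the missing fact into the prompt at request time rather than retraining. The OpenAI Python client and model name here are illustrative stand-ins for any chat-style LLM API, not something from the thread.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The fact the model's training cutoff predates: the Athletics' 2025 move.
context_note = (
    "Context: as of the 2025 MLB season, the Athletics (formerly OAK, for "
    "Oakland) play in Sacramento under the acronym ATH. "
    "ATH is not a typo for ATL (the Atlanta Braves)."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        # Supplying the mapping as context prevents the ATH -> ATL guess.
        {"role": "system", "content": context_note},
        {"role": "user", "content": "Write a one-line recap: ATH 5, SEA 3."},
    ],
)
print(response.choices[0].message.content)
```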

If they stop getting returns in intelligence, they will switch to returns in efficiency and focus on integration, with new models being trained for new data cutoffs if nothing else. Even if everything halted tomorrow, there would be at least a 5-10 year integration/adoption period ahead.

There is no reality in which LLMs go away (shy of being replaced).

  • > If they stop getting returns in intelligence, they will switch to returns in efficiency

    I don't think we can assume that people producing what appear to be addictive services are going to do that, especially when they seem to be addicted themselves.

  • Is adding new data to a model a full retraining from scratch? Or can it be added on top of an existing model?

    If it costs $10B to add 1 year of data to an existing model, every year, that doesn’t sound too good.
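On the question above: new data can typically be added on top of an existing model via continued pretraining, rather than a full retrain. A minimal sketch using Hugging Face transformers; the gpt2 checkpoint and new_data_2025.txt corpus are placeholder assumptions, and frontier-scale runs are of course far more involved.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

checkpoint = "gpt2"  # stand-in for an existing pretrained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# "new_data_2025.txt" is a hypothetical corpus of only the new year's text.
dataset = load_dataset("text", data_files={"train": "new_data_2025.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,  # starts from the existing weights, not from scratch
    args=TrainingArguments(output_dir="ckpt-2025", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # updates the checkpoint; far cheaper than a full pretrain
```

Whether this is cheap enough at frontier scale, and whether it avoids degrading older capabilities, is exactly the open question the comment raises.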

Many argue that the current batch of models provides a large capability overhang. That is, we are still learning how to get the most out of these models in various applications.

Models don't become out of date now that deep research exists (see the retrieval sketch below).

  • So with every prompt you are expected to wait that long? I highly doubt most people will be willing to wait. It also doesn't seem entirely viable if you want to run it locally: you have less bandwidth and none of the caching that a literal search engine of data can do.
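For what it's worth, the "deep research" point above boils down to retrieval at query time: fetch fresh documents, prepend them to the prompt, and a frozen model can still answer with current facts. A toy sketch of that pattern; the word-overlap scoring and in-memory corpus are trivial stand-ins for a real search index, and they sidestep rather than settle the latency objection.

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

# Stand-in for documents fetched at query time, after the training cutoff.
corpus = [
    "In 2025 the Athletics moved to Sacramento and adopted the acronym ATH.",
    "The Atlanta Braves retain the acronym ATL.",
    "The Oakland Coliseum hosted the Athletics through the 2024 season.",
]

query = "What does ATH stand for in 2025 box scores?"
context = "\n".join(retrieve(query, corpus))

# The augmented prompt a frozen model would actually see:
prompt = f"Context:\n{context}\n\nQuestion: {query}"
print(prompt)
```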