Comment by zoogeny
1 year ago
I think this highlights the winner-take-all stakes of intelligence. It also suggests that there is little to be gained by specialization. Building a brand might actually be more profitable in the short term, since you can swap in the latest AI models as they become available. In other words, if advancing the SOTA AI is your dream, a product company may not be the right place. And if building a product company is your dream, then building foundational AI might not be the best strategy.
> if advancing the SOTA AI is your dream, a product company may not be the right place.
Does Meta get in the way of this?
It's hard to compete with a company that is dead set on spending billions and seemingly wants to drive your SOTA AI product revenue to 0.
If you are OpenAI or Anthropic right now, it seems like trying to run a great restaurant at a reasonable price right next to a good (great?) restaurant that is serving everyone for free.
Yes. Meta's current strategy is extremely disruptive to other companies that are trying to build a business in the foundation model space.
Presumably this is because Meta desperately wants to avoid becoming dependent on other companies in this new generative AI world. Mark Zuckerberg talks about not wanting a repeat of the Apple tax in his post about Llama 3.1 here: https://about.fb.com/news/2024/07/open-source-ai-is-the-path...
Seems to me, if Zuck is dropping a gorillion dollars to put free trained models out there (for small outfits to fine-tune to their purposes), the play is to take that enormous field of small outfits and put them on a level playing field against Zuck's enemies, such as Google. So he's wrecking Google's chances indirectly, which is a clever play. As for Zuck, yes, it's not ideal that all his tech is out there, but that doesn't mean he doesn't have better tech behind locked doors, and the silver lining is that he now effectively has a gorillion people tuning and optimizing HIS model, which is priceless. I wish Google entered into such a war, to be honest, and especially OpenAI.
My take is that this has more to do with the coming years than the current climate.
I think it is just a consequence of the cost of getting to the next level of AI. Estimates for training a GPT-5-level foundational model are on the order of $1 billion, and it isn't going to get cheaper from there. So even if your model is a bit better than the free models available today, unless you are spending that $1 billion+ today, you are going to look weak in 6 months to a year. And by then the GPT-6+ training costs will be even higher, so you can't just wait and play catch-up. You are probably right as well: there is a fear that a competitor based on an open-source model gets close enough in capability to generate bad publicity.
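To make the "can't just wait and play catch-up" point concrete, here's a toy sketch of the compounding-cost logic. The 5x-per-generation multiplier and the $1B base figure are purely illustrative assumptions, not numbers from this thread:

```python
# Toy model of the escalating-cost argument above.
# Assumptions (illustrative only): the current frontier run costs ~$1B,
# and each subsequent generation costs ~5x the previous one.

GEN_COST_MULTIPLIER = 5      # assumed cost growth per generation
BASE_COST_B = 1.0            # assumed cost of today's frontier run, in $B

def training_cost(generations_ahead: int) -> float:
    """Estimated training cost (in $B) for a run N generations past today's frontier."""
    return BASE_COST_B * GEN_COST_MULTIPLIER ** generations_ahead

# Skipping a generation to save $1B means the next buy-in is far larger:
for n in range(3):
    print(f"gen +{n}: ~${training_cost(n):.0f}B")
# gen +0: ~$1B
# gen +1: ~$5B
# gen +2: ~$25B
```

Under these assumptions, sitting out one cycle doesn't save you the entry fee; it multiplies it.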
I imagine Character.ai (like Inflection) did the calculations and realized that there was no clear path to recouping that magnitude of investment based on their current product lines. And when they brainstormed ways to increase the return, they found that none of the paths strictly required a proprietary foundational model. Just my speculation, of course.
What do "GPT-5" and "GPT-6" even mean? I gently suggest they aren't currently meaningful; it's not like CPU GHz frequency steppings. If anything it's more akin to chip fab processes, e.g. 10nm, 5nm, 3nm. Each reduction in feature size requires new physical technology, chip architecture, and a black-box bag of tricks to eke out better performance.
Where is the data for a billion-dollar training run going to come from? These companies are already training on most of the valuable information available.
While training will surely be expensive, I think it's even more expensive and challenging to organize and harness the brainpower to figure out and execute the next meaningful step forward.
Specializing will happen in product implementation, not model implementation.
LLMs are becoming akin to tools, like programming languages. They're blank slates, but require implementation to become special.
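A minimal sketch of that idea: the product's value lives in its prompts and workflow, while the underlying model is a swappable dependency. Everything here is hypothetical; `call_model` stands in for whatever provider API you actually use:

```python
# Sketch: specialization in the product layer, not the model layer.
from dataclasses import dataclass

@dataclass
class ProductAssistant:
    model_name: str       # swap "model-v1" for "model-v2" when a better one ships
    system_prompt: str    # the product's specialization lives here, not in the weights

    def answer(self, user_input: str) -> str:
        return call_model(self.model_name, self.system_prompt, user_input)

def call_model(model: str, system: str, user: str) -> str:
    # Hypothetical placeholder for a real inference API call.
    return f"[{model}] responding per {system!r} to {user!r}"

# The brand/product is the prompt and workflow; the model is interchangeable.
legal_helper = ProductAssistant("model-v1", "You are a contract-review assistant.")
print(legal_helper.answer("Flag risky clauses in this NDA."))
```

On this view, upgrading the foundation model is a one-line change, which is exactly why building the brand rather than the model can be the safer bet.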