Comment by yogthos

2 days ago

I actually expect they will keep making the models open, and we'll likely converge on one or two alternatives because there's really not much difference between them at the end of the day. Everybody curating their own model is a huge duplication of effort with no clear benefit. There's a reason not everybody rolls their own operating system, for example.

It all comes back to what I said earlier. If you treat the model as the product, then it makes sense to keep it closed. You have some secret sauce that nobody else has, and you sell it. But the reality is that nobody has a magic formula that's significantly better than what other people can figure out. You might get an advantage for a few months tops, and then other models start catching up.

And this creates involution: a race to the bottom where nobody makes any money. On the other hand, if you treat models as infrastructure, and everybody contributes to the same pool of knowledge, then you amortize the cost of making a better model. The money comes from actual products that can genuinely differentiate themselves. Companies are going to seek niches they can dominate, where they do a specific thing really well. That's a much more realistic path towards long-term sustainability.

And that's why I expect models are going to become infrastructure akin to Linux in the long run. They're just not where the profit is.

The difference is that making Linux doesn't require a lot of money. Most of it is written by people for free in their spare time. They do it because they like to solve complex puzzles and be recognized within the open source community.

OTOH, training a model requires a lot of hardware and energy, and the money has to come from somewhere.

Do you think that China's government is going to pay for it and release it openly to the world for the purpose of goodwill towards China, or some other reason? What would it be? Or would some other groups do it?

  • Linux actually gets a lot of funding and corporate development. Training models is also getting cheaper every year. There are also projects like Petals, which let you run and fine-tune models in a distributed fashion: https://github.com/bigscience-workshop/petals

    And yeah, I do think China's government is going to continue subsidizing this tech, because it's being used all over the place in China now. Meanwhile, the models aren't developed by the government; they're developed by individual companies that get subsidies. As I've explained above, converging on common infrastructure is going to save resources for all these companies. And continuing to work in the open with the rest of the world means getting the benefit of a global community of researchers helping advance this tech.

    It's not just altruism or clout. American companies working on closed models have to foot the bill for all the research, and they're limited to the brainpower within the company. And they're competing with Chinese companies, which have a much bigger research community contributing to their models.

    If the model itself is not the product, then American companies find themselves in a situation where they're spending a ton of resources on something that's not their core business.