Comment by ninjahawk1

20 hours ago

The way to develop in this space seems to be to give away free stuff, get your name out there, then make everything proprietary. I hope they still continue releasing open weights. The day no one releases open weights is a sad day for humanity. Normal people won’t own their own compute if that ever happens.

I think that's an overgeneralization. We've seen all the American models be closed and proprietary from the start, while the non-American ones (especially the Chinese) have been open since the start. In fact they often go in the opposite direction: many Chinese models started off proprietary and were later opened up (like many of the larger Qwen models).

  • > We've seen all the American models be closed and proprietary from the start

    What about Gemma and Llama and gpt-oss, not to mention lots of smaller/specialized models from Nvidia and others?

    I would never argue that China isn't ahead in the open weights game, of course, but it's not like it's "all" American models by any stretch.

    • gpt-oss is good, but I haven't heard anything about an update. It seems like a one-and-done release to quiet the people complaining that OpenAI isn't open.

    • The more accurate version is that only Chinese companies (plus Facebook, briefly) really open-source their frontier models. The rest are non-frontier: either older or specialized for something.

    • It's all openwashing. All of the ones you listed have at some point expressed how important and valuable open weights and locally usable models are, and every single one of them has since increasingly pushed closed, proprietary, or cloud-only options.

      I'm annoyed at myself, because I hoped for and praised Chinese AI while it was opening up as Llama was closing, but Qwen looks to be running the same playbook here as Llama/Meta, Gemma/Google, and gpt-oss/OpenAI.

  • > We've seen all the American models be closed and proprietary from the start.

    Most*.

    OpenAI, contrary to popular belief, actually used to believe in open research and (more or less) open models. GPT-1 and GPT-2 were both model-plus-code releases (although GPT-2 was a "staged" release); GPT-3 ended up API-only.

    • That's fair but those days seem so long gone now.

      Also, the Chinese models aren't following the typical American SaaS playbook, which relies on free/cheap proprietary software for early growth. They are not just publishing their weights but also their code, and often even publishing papers in open-access journals to explicitly highlight what methods and advancements made their results possible.

      4 replies →

I think it is in the interest of chip makers to make sure we all get local models

  • I think they're in a win-win situation. Big AI companies would love to see local computing die in favour of the cloud, because they are well aware that the moment an open model appears that can run on non-ludicrous consumer hardware, they're screwed. In that situation Nvidia, AMD and the like would be the only ones profiting, though I'm not convinced they'd prefer going back to fighting for B2C while B2B is so much simpler for them.

    • If you want to run AI models at scale and with reasonably quick response times, there aren't many alternatives to datacenter hardware. Consumer hardware is great for repurposing existing "free" compute (including gaming PCs, pro workstations, etc. at the higher end) and as basic insurance against rug pulls from the big AI vendors, but increased scale will probably still bring very real benefits.

      4 replies →

    • At a consistent amount of usage, datacenters are at least an order of magnitude more hardware efficient. I'm sure Nvidia and AMD would be fine fighting for B2C if it meant volume would be 10+x.

      Now, given they can't satisfy current volume, they are forced to settle for just having crazy margins.

      3 replies →

    • There are also many Chinese AI-targeted GPU/NPU producers. You can get hold of some boards on taobao.com, and they are usable in some ways.

      No, Nvidia and AMD are not the only ones benefiting.

This is obviously a strategic move at a national level. Keep publishing competing free models to erode the moat western companies could have with their proprietary models. As long as the narrative serves China there will be no turn to proprietary models.

  • >This is obviously a strategic move at a national level.

    No, it isn't. That's the kind of thing said by people who've never worked in the Chinese software ecosystem. It's how the Chinese internet has worked for 20+ years. The Chinese market is so large, and competition so rabid, that every company throws as much free stuff at consumers as it can to gain users. Entrepreneurs don't think about "grand strategic moves at the national level" while flipping through their copies of the Art of War and Confucius lol

    • If this were true, they'd build services around those models and provide them for free, or vastly cheaper than the western competition. But that's not what they're doing. Instead they're giving away the entire model for free. And by the way, Qwen isn't built by some random entrepreneur trying to solve the cold-start problem, but by Alibaba, which is a fucking behemoth. And, unsurprisingly, none of these models answer uncomfortable questions about China's past. Because, sure, the first thing any entrepreneur would think of is protecting their government and its history. Happens all the time, no state interference here, move on.

      1 reply →

That has been a viable commercial strategy for most modern, funded businesses: capture market share at a loss, then, once your name is established, turn on the profit.

Always has been; it's literally SaaS. The slight difference is that the lowest-tier subscriptions at the frontier labs are basically free trials nowadays, too.

I'm a little more optimistic than that. I suspect that the open-weight models we already have are going to be enough to support incremental development of new ones, using reasonably-accessible levels of compute.

The idea that every new foundation model needs to be pretrained from scratch, using warehouses of GPUs to crunch the same 50 terabytes of data from the same original dumps of Common Crawl and various Russian pirate sites, is hard to justify on an intuitive basis. I think the hard work has already been done. We just don't know how to leverage it properly yet.

  • Change layer size and you have to retrain. Change number of layers and you have to retrain. Change tokenization and you have to retrain.

    • Hopefully we will find a way to make it so that minor changes don't require a full retrain. Training how to train, as a concept, comes to mind.

    • And yet the KL divergence after changing all that stuff remains remarkably similar between different models, regardless of the specific hyperparameters and block diagrams employed at pretraining time. Some choices are better, some worse, but they all succeed at the game of next-token prediction to a similar extent.

      To me, that suggests that transformer pretraining creates some underlying structure or geometry that hasn't yet been fully appreciated, and that may be more reusable than people think.

      Ultimately, I also doubt that the model weights are going to turn out to be all that important. Not compared to the toolchains as a whole.

      1 reply →

    • None of that is true, at least in theory. You can trivially change layer size simply by adding extra columns initialized to zero, effectively embedding the smaller network in a larger one. You can add layers in a similar way, and in fact LLMs are surprisingly robust to having layers added and removed; you can sometimes actually improve performance simply by duplicating some middle layers[0]. Tokenization is probably the hardest, but all the layers between the first and last just operate on embeddings; it's probably not impossible to retrain the embedding layers while preserving the middle parts.

      [0] https://news.ycombinator.com/item?id=47431671 https://news.ycombinator.com/item?id=47322887

      3 replies →
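      The zero-padding trick above can be sketched in a few lines of NumPy. This is a minimal illustration of the idea, not code from any real model; the `widen_linear` helper is hypothetical:

```python
import numpy as np

# Embed a trained (out_dim x in_dim) linear layer inside a larger
# (new_out x new_in) one by padding with zeros. On inputs that are
# zero-padded the same way, the widened layer reproduces the original
# outputs exactly; the new rows/columns start as dead weights that
# further training can grow into.
def widen_linear(W, b, new_out, new_in):
    out_dim, in_dim = W.shape
    W_big = np.zeros((new_out, new_in))
    W_big[:out_dim, :in_dim] = W   # old weights sit in the top-left corner
    b_big = np.zeros(new_out)
    b_big[:out_dim] = b
    return W_big, b_big

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))    # a small "trained" layer
b = rng.standard_normal(4)
x = rng.standard_normal(3)

W_big, b_big = widen_linear(W, b, new_out=8, new_in=6)
x_big = np.zeros(6)
x_big[:3] = x                      # zero-pad the input to match

# The widened layer agrees with the original on the first 4 outputs,
# and the new outputs are exactly zero before any further training.
assert np.allclose(W @ x + b, (W_big @ x_big + b_big)[:4])
```

      Duplicating a middle layer is similarly mechanical in residual architectures (copy a block's weights and insert it into the stack), which is presumably why the linked results find models tolerate it.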

  • I do not think it's Common Crawl anymore; it's Common Crawl++, using paid human experts to generate and verify new content, whether it's code or research.

    I believe the US is building this off the cost difference with other countries, using companies like Scale, Outlier, etc., while China has the internal population to do it itself.

The Chinese state wants the world using their models.

People think that Chinese AI labs are just super cool bros that love sharing for free.

They don't understand that it's just a state-sponsored venture meant to further entrench China in global supply and logistics. China's VCs are Chinese banks and a sprinkle of "private" money. Private in quotes, because technically it still belongs to the state anyway.

China doesn't have companies and government like the US. It just has government, and a thin veil of "company" that readily fools westerners.

  • As opposed to the US, which just has companies and a thin veil of “government”.

    • Also, many of these Chinese companies aren't just opening their weights. They are open-sourcing their code AND publishing detailed research papers alongside it to reveal how they accomplished what they accomplished.

      That's very different from the American SaaS model, which relies on free but proprietary software for early growth.

  • I'm not sure how local AI models are meant to "entrench China in global supply and logistics". The two areas have nothing to do with one another. You can easily run a Chinese open model on all-American hardware.

    • They are building a pipeline, and the goal is to get people in the door.

      If you forever stand at the entrance eating the free samples, that's fine; they don't care. Other people are going through the door, and you are still consuming what they feed you. That doesn't mean it's going to be bad or evil, but they are staking out their territory of control.

      1 reply →

  • I'm Aussie. Please explain to me: why should I care whether Chinese SOEs or US tech companies are winning? Neither has my best interests at heart.

  • Like with nuclear technology, it's not healthy for only one country to dominate AI. The cat is already out of the bag and many countries now have the ability to train and run models. Silicon Valley has bootstrapped this space. But it should be noted that they are using AI talent from all over the world, and it was sort of inevitable that this technology would get around. Lots of Chinese, Indian, Russian, and European people are involved.

    As for what comes next, it's probably going to be a bit of a race for who can do the most useful and valuable things the cheapest. If OpenAI and Anthropic don't make it, the technology will survive them. If they do, they'll be competing on quality and cost.

    As for state sponsorship, a lot of things are state-sponsored, including in the US. Silicon Valley has a rich history rooted in massive government funding programs; there's a great documentary on this, The Secret History of Silicon Valley. Not to mention that all the "cheap" gas currently powering data centers comes on the back of a long history of public funding being channeled into the oil and gas industry.

    • >As for state sponsorship, a lot of things are state sponsored.

      You can make any comparison you want if you use adjectives rather than values. I could say that cars use a massive amount of water (all those radiators!) to try to downplay agricultural water usage, but it would be blatantly disingenuous.

      SV is overwhelmingly private (actually, constitutionally private) money, to the point that you should disregard people saying otherwise, just as you would the people saying cars use massive amounts of water.

  • So an OPEN model that I can run on my own fucking hardware will entrench China in global supply and logistics how?

    Contrary: How will the closed, proprietary models from Anthropic, "Open"AI and Co. lead us all to freedom? Freedom of what exactly? Freedom of my money?

    At some point this "anti-communism" bullshit propaganda has to stop. And that moment was decades ago!

  • So what?

    I still prefer that over US total dominance.

    Let them fight it out.

    • Yeah, a lot of people are still living within the paradigm of tribalism: my team good, other team bad.

      But the events of the past decade or so have clearly demonstrated that there are no "good" actors.

      I personally couldn't care less who wins in the China vs US AI competition, both sides have a long list of pros and cons.

    • I'd get a bit informed about what exactly Chinese dominance entails. Ask a few Uyghurs, Cantonese Hong Kongers, or even Tibetans.

      Then decide ...

      4 replies →

  • Well, isn't this what the US and really any other power in the world has always done, since forever?

Why is it sad? These things are useless all around, along with the people who overuse them.

It would be a great day for humanity if people stopped glazing text autocomplete as revolutionary.