Comment by I_am_tiberius

9 hours ago

Open weight!

10 comments

I_am_tiberius

Please don't slander the most open AI company in the world. Even more open than some non-profit labs from universities. DeepSeek is famous for publishing everything. They might take a bit to publish source code but it's almost always there. And their papers are extremely pro-social, written to help the broader open AI community. This is why they struggle to get funded: investors hate openness. And in China they struggle against the political and hiring power of the big tech companies.

Just this week they published a serious foundational library for LLMs https://github.com/deepseek-ai/TileKernels

Others worth mentioning:

https://github.com/deepseek-ai/DeepGEMM a competitive foundational library

https://github.com/deepseek-ai/Engram

https://github.com/deepseek-ai/DeepSeek-V3

https://github.com/deepseek-ai/DeepSeek-R1

https://github.com/deepseek-ai/DeepSeek-OCR-2

They have 33 repos and counting: https://github.com/orgs/deepseek-ai/repositories?type=all

And DeepSeek often has very cool new approaches to AI that the rest then copy. Some of those who copied their tech have 10x or 100x the GPU training budget, and that's their moat to stay competitive.

The models from Chinese Big Tech and some of the small ones are open weights only (and allegedly benchmaxxed, see https://xcancel.com/N8Programs/status/2044408755790508113). Not the same.

patshead 8 hours ago
DeepSeek's models are indeed open weight. Why do you feel that pointing this out would be considered slander?
- culi 5 hours ago
  
  I think they were reading GP's comment as a correction. Like "not open-source, just open weight". I'm not sure if their reading was accurate but I enjoyed their high effort comment nonetheless
kortilla 7 hours ago
It’s not slander to say something true. These are open weights, not open source. They don’t provide the training data or the methodology required to reproduce these weights.
So you can’t see what facts are pruned out, what biases were applied, etc. Even more importantly, you can’t make a slightly improved version.
This model is as open source as a Windows XP installation ISO.
- alecco 6 hours ago
  
  > These are open weights, not open source.
  Did you even read my comment?
  

0-_-0 8 hours ago

Weights are the source, training data is the compiler

crazylogger 8 hours ago
Training data == source code, training algorithm == compiler, model weights == compiled binary.
- 0-_-0 7 hours ago
  
  Training algorithm is the programmer, weights are the code that you run in an interpreter
ngruhn 7 hours ago

Isn't it more like the data is the source, the training process is the compiler, and the weights are the binary output?