Comment by sylware
3 months ago
And from people doing the same, I got signals that it clearly states that China is an authoritarian regime and describes it properly.
Got mixed signals about this.
Namely, unless I run the 671B myself, I won't be able to trust any news about it.
Western AIs are full of propaganda as well, and contain clearly politically opinionated points of view on just about everything, especially the frontier models offered as SaaS.
That's why the rest of the world mostly doesn't care about the obvious pro-China propaganda inside DeepSeek: it is just more of the same, but making the other side look good this time.
Remember that huge parts of the planet's population (including good chunks of people geographically located in North America and Western Europe) do not feel especially close to either the US or China (and/or their allies/friends), nor do they particularly share or align with their points of view on most things.
Are those outputs actually from the 671B model? The 671B model needs 8xH200 GPUs at minimum, which costs $25/hr to rent. If you didn't pay that much, you were not running R1, but rather Qwen- or LLaMA-based distillations. We paid that much to rent a machine to run the full 671B model!
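For anyone who wants to sanity-check the hardware claim, here is a rough back-of-envelope sketch. It assumes 1 byte per parameter (FP8 weights) or 2 bytes (BF16), takes 141 GB and 80 GB as the per-GPU memory of an H200 and H100, and ignores KV cache, activations and framework overhead, so real requirements are higher. The function names are made up for illustration.

```python
import math

PARAMS = 671e9   # parameter count quoted in the thread
H200_GB = 141    # assumed memory per H200, in GB
H100_GB = 80     # assumed memory per H100, in GB


def weights_gb(params, bytes_per_param):
    """Size of the raw weights in GB (weights only, no KV cache)."""
    return params * bytes_per_param / 1e9


def min_gpus(params, bytes_per_param, gpu_gb):
    """Fewest GPUs whose combined memory holds the weights alone."""
    return math.ceil(weights_gb(params, bytes_per_param) / gpu_gb)


def fits_in_node(params, bytes_per_param, gpus_per_node, gpu_gb):
    """True if the weights alone fit in one multi-GPU node."""
    return weights_gb(params, bytes_per_param) <= gpus_per_node * gpu_gb


if __name__ == "__main__":
    print(weights_gb(PARAMS, 1))                # 671.0 GB at 1 byte/param
    print(fits_in_node(PARAMS, 1, 8, H100_GB))  # False: 8 x 80 = 640 GB
    print(fits_in_node(PARAMS, 1, 8, H200_GB))  # True: 8 x 141 = 1128 GB
    print(fits_in_node(PARAMS, 2, 8, H200_GB))  # False: 1342 GB at 2 bytes/param
```

Under those assumptions the weights alone would fit on fewer than eight H200s, but a standard 8xH100 node falls just short, which is probably why a single 8xH200 machine is the usual quoted minimum.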
Nope, you can run the 671B on 100% CPU and storage. It will take longer to get tokens out of it, but it will work.
I heard there are some optimizations for CPU inference from storage, so it should be a tad "less slow".
Time to split that RAM among your CPU cores and mmap blocks of weights for inference from storage.
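To make the mmap idea concrete, here is a toy sketch, not how any particular inference engine does it. It writes a tiny float32 weight matrix to a file, maps it read-only, and computes a matrix-vector product one block of rows at a time, so the OS only pages in the blocks that are actually touched. The file layout, sizes and function names are all hypothetical.

```python
import mmap
import os
import struct
import tempfile

ROWS, COLS = 8, 4      # toy sizes; a real layer is vastly larger
ROW_BYTES = COLS * 4   # one row of little-endian float32 values


def write_toy_weights(path):
    """Write a ROWS x COLS matrix where W[r][c] = r + c."""
    with open(path, "wb") as f:
        for r in range(ROWS):
            row = [float(r + c) for c in range(COLS)]
            f.write(struct.pack(f"<{COLS}f", *row))


def matvec_from_storage(path, x, block_rows=2):
    """Compute W @ x, reading W from the mapped file block by block."""
    out = []
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for start in range(0, ROWS, block_rows):
                stop = min(start + block_rows, ROWS)
                # Slicing touches only this block's pages; the OS loads
                # them from storage on demand and may cache them in RAM.
                block = mm[start * ROW_BYTES:stop * ROW_BYTES]
                for row in struct.iter_unpack(f"<{COLS}f", block):
                    out.append(sum(w * xi for w, xi in zip(row, x)))
    return out


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        weights_path = os.path.join(tmp, "toy_weights.bin")
        write_toy_weights(weights_path)
        print(matvec_from_storage(weights_path, [1.0, 1.0, 1.0, 1.0]))
```

Splitting the work across CPU cores would amount to handing each worker its own range of row blocks; that part is left out here to keep the sketch short.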
Sure, but he explicitly stated 'GPU Servers', making it likely he didn't use the CPU for inference, which validates the question about what GPU setup they used.