Comment by HarHarVeryFunny
3 months ago
Probably the API - there is certainly a difference, and I doubt the goal of someone putting out an article like this was to make the model look good.
Anyway, it's missing the point - if you don't like the model, just read the paper and replicate the process. The significance of DeepSeek-R isn't the trained model itself - it's how they got there, and how efficiently.