Comment by dingocat
6 months ago
What do you mean there is no such thing as R1-1.5b? DeepSeek released a distilled version based on a 1.5B Qwen model with the full name DeepSeek-R1-Distill-Qwen-1.5B, see chapter 3.2 on page 14 of their research article [0].
Which is not the same model: it's not R1, it's R1-Distill-Qwen-1.5B.
A distinction they make clear and write extensively about on the model page, yes?
Where is that made clear in "ollama run deepseek-r1", the command to download/run the model?
Ollama labels the Qwen models R1, while the "R1" moniker on its own, in the DeepSeek world, means the full model, which has nothing to do with Qwen.
https://ollama.com/library/deepseek-r1
That might have been OK if it were just the same model at different sizes, but these are completely different things, and it has created confusion out of thin air for no reason other than Ollama being careless.
And their documentation makes that distinction clear, having dedicated a section specifically to the distilled models.