Comment by dingocat

6 months ago

What do you mean there is no such thing as R1-1.5b? DeepSeek released a distilled version based on a 1.5B Qwen model, with the full name DeepSeek-R1-Distill-Qwen-1.5B; see section 3.2 on page 14 of their research article [0].

[0] https://arxiv.org/abs/2501.12948

Ollama labels the Qwen models R1, while the "R1" moniker standing on its own in the DeepSeek world means the full model, which has nothing to do with Qwen.

https://ollama.com/library/deepseek-r1

That may have been OK if it was just the same model at different sizes, but they're completely different things here, and it's created confusion out of thin air for absolutely no reason other than Ollama being careless.

  • And their documentation makes that distinction clear, having dedicated a section specifically to the distilled models.