Comment by seanmcdirmid

3 months ago

I'm confused. According to another comment (https://news.ycombinator.com/item?id=42859645), the <=70B DeepSeek models are just fine-tunings of Llama or Qwen? So we shouldn't think of these models as actually being DeepSeek.

I think people are confusing the smaller models (which are Qwen/Llama underneath, not DeepSeek originals) with the 671B DeepSeek R1 model being talked about here, which very few people can run locally.