Comment by nicebyte
3 months ago
> …they are an extremely unusual person and have spent upwards of $10,000
eh? doesn't the distilled+quantized version of the model fit on a high-end consumer-grade GPU?
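Back-of-envelope for the weights alone (parameter counts are the commonly cited ones; KV cache and runtime overhead add several GB on top, so treat these as lower bounds):

```python
# Rough VRAM needed just to hold the weights at a given quantization.
# Ignores KV cache, activations, and runtime overhead (often several GB more).

def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for name, params, bits in [
    ("R1 distill 7B, 4-bit",  7,   4),
    ("R1 distill 70B, 4-bit", 70,  4),
    ("full R1 671B, 4-bit",   671, 4),
]:
    print(f"{name}: ~{weight_vram_gb(params, bits):.0f} GB")

# R1 distill 7B, 4-bit: ~4 GB    -> fits on a consumer GPU with room to spare
# R1 distill 70B, 4-bit: ~35 GB  -> needs a 48 GB card or multiple GPUs
# full R1 671B, 4-bit: ~336 GB   -> nowhere near consumer hardware
```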
The "distilled+quantized versions" are not the same model at all, they are existing models (Llama and Qwen) finetuned on outputs from the actual R1 model, and are not really comparable to the real thing.
That's semantics; they are strongly comparable in their inputs and outputs. Distillation is different from fine-tuning.
Sure, you could say that only running the 600B+ model is running "the real thing"...
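For what it's worth, the textbook distinction: classic (logit-level) distillation trains the student against the teacher's full output distribution, while fine-tuning only sees hard labels. A toy sketch with stand-in tensors, purely illustrative:

```python
import torch
import torch.nn.functional as F

vocab, T = 8, 2.0                       # tiny vocab; T = distillation temperature
student_logits = torch.randn(1, vocab, requires_grad=True)
teacher_logits = torch.randn(1, vocab)  # stand-in for the teacher's logits
hard_label = torch.tensor([3])          # stand-in for a single dataset token

# Fine-tuning: cross-entropy against the one "correct" token.
ft_loss = F.cross_entropy(student_logits, hard_label)

# Distillation: KL divergence against the teacher's softened distribution,
# so the student also learns how the teacher ranks every wrong token.
kd_loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * T * T
```

(The R1 distills reportedly used the sequence-level variant, i.e. SFT on sampled text, which is why both camps in this thread have a point.)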
A distilled version running on another model architecture does not count as using "DeepSeek". It counts as running a Llama:7B model fine-tuned on DeepSeek.
That’s splitting hairs. Most people take “running locally” to mean running the model on your own hardware rather than the provider’s.
Except you're not running the model locally; you're running an entirely different model that is deceptively named.
You can pretend it's R1, and if it works for your purpose that's fine, but it won't perform anywhere near the same, and any tests performed on it are not representative of the real model.
Pretty sure this is just a conflict between layman and expert usage of the word.
For everyone who doesn’t build LLMs themselves, “running a Llama:7B model fine-tuned on DeepSeek” _is_ using DeepSeek, mostly on account of all the tools and files being named DeepSeek, and all the tutorials aimed at casual users being titled with equivalents of “How to use DeepSeek locally”.
> “running a Llama:7B model fine-tuned on DeepSeek” _is_ using DeepSeek mostly on account of all the tools and files being named
Most people confuse mass and weight; that does not mean weight and mass are the same thing.