Comment by rahimnathwani
3 months ago
The model you were using was created by Qwen, and then fine-tuned for reasoning by DeepSeek.
- DeepSeek didn't design the model architecture
- DeepSeek didn't collate most of the training data
- DeepSeek isn't hosting the model
Yes, 100%. However, the distilled models are still pretty good at sticking to DeepSeek’s approach to censorship. I would assume that the behavior comes from the reasoning patterns and fine-tuning data they were distilled from, but I could be wrong. And yes, DeepSeek’s hosted model has additional guardrails evaluating the output, but those aren’t inherent to the model itself.