Comment by siwakotisaurav

1 year ago

Models other than the 600b one are not R1. It’s crazy how many people are conflating distilled qwen and llama 1 to 70b models as r1 when saying they’re hosting them locally

The point does stand if you’re talking about using deepseek r1 zero instead which afaik you can try on hyperbolic and it apparently even answers the tianmen square question.