Comment by 1ba9115454
1 year ago
I can't imagine this setup will get more than 1 token per second.
I would love to see Deepseek running on premise with a decent TPS.
1 year ago
I can't imagine this setup will get more than 1 token per second.
I would love to see Deepseek running on premise with a decent TPS.
It says 4.25 TPS in the first para.
Honest mistake. Some people think HN is just a series of short tweets and haven’t realized they are links yet!
It's the modern way. Why read when you can just imagine facts straight out of your own brain.
1 reply →
4.25 is enough tps for a lot of use cases.
That's still pretty slow, considering there's that "thinking" phase.
True, but 4.25 is the number we all want to know.
You can get 1t/s on a raspberry pi.
https://youtu.be/o1sN1lB76EA?si=i8ecEBjLdV0zewFQ
this has nothing to do with the full 671B and the ollama models are distilled qwen2.5
I appreciate both of these comments, thank you both.