Comment by osti
1 day ago
Yeah... I would definitely call 2 t/s unusable. For simple chats, I'd want at least 15 t/s. For agentic coding (which this model is advertised for), I'd want good prefill performance as well.