
Comment by anonzzzies

2 days ago

Shrinking and speed; speed is a major thing. Claude Code is just too slow: it's very good, but the overhead means it has no reasonable way to handle simple requests quickly, so everything should just be faster. If I were Anthropic, I would've bought Groq or Cerebras by now. Not sure if they (or the other big ones) are working on similar inference hardware to provide 2000 tok/s or more.

Z.ai (at least the mid/top-end subscription; not sure about the API) is pretty slow too, especially during some periods. Cerebras is of course probably a different story (if it's not quantized).