Comment by ttoinou

10 hours ago

Cerebras already has GLM 4.7 in the code plans

4 comments

vessenes 10 hours ago

Yep. But this is like 10x faster; 3B active parameters.

ttoinou 10 hours ago
Cerebras is already at 200-800 tps; do you need even faster?
- overfeed 9 hours ago
  
  Yes! I don't try to read agent tokens as they are generated, so if code generation drops from 1 minute to 6 seconds, I'll be delighted. I'll even accept 10s -> 1s speedups. Considering how often I've seen agents spin their wheels trying different approaches, faster is always better, at least until models can one-shot solutions without the repeated "No, wait..." / "Actually..." thinking loops.
  