Comment by himata4113

3 days ago

I personally don't buy it. Cerebras is way more advanced than this; comparing this tok/s to Cerebras is disingenuous.

Cerebras is a totally different product though. They can (theoretically) run any frontier model provided it gets compiled a certain way, like a wafer-scale TPU.

This uses hardwired weights, with on-die SRAM used for the K/V cache, for example. It's WAY more power efficient and faster. The tradeoff is that it's hardwired.

Still, most frontier models are "good enough" that an obscenely fast version would be a major seller.