Comment by pigpop
4 days ago
Aren't they only using the SRAM for the KV cache? They mention that the hardwired weights have a very high density. They say about the ROM part:
> We have got this scheme for the mask ROM recall fabric – the hard-wired part – where we can store four bits away and do the multiply related to it – everything – with a single transistor. So the density is basically insane.
I'm not a hardware guy, but they seem to be making a strong distinction between the techniques they're using for the weights and the ones they're using for the KV cache:
> In the current generation, our density is 8 billion parameters on the hard wired part of the chip, plus the SRAM to allow us to do KV caches, adaptations like fine tuning, etc.
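For a rough sense of scale, here's a back-of-the-envelope sketch of what those two quotes imply together. It assumes each parameter is a single 4-bit weight, which the quotes don't actually state, and the function name is just something I made up for illustration:

```python
def rom_weight_estimate(params: int,
                        bits_per_weight: int = 4,       # assumption: 4-bit weights
                        bits_per_transistor: int = 4):  # from the "four bits ... single transistor" quote
    """Rough storage and transistor count for the hard-wired weights."""
    total_bits = params * bits_per_weight
    return {
        "total_bits": total_bits,
        "gigabytes": total_bits / 8 / 1e9,               # decimal GB
        "transistors": total_bits // bits_per_transistor,
    }

est = rom_weight_estimate(8_000_000_000)  # the quoted 8 billion parameters
print(est)
```

Under those assumptions that works out to about 4 GB of weights held in roughly 8 billion transistors, one per weight. The SRAM for the KV cache and adapters would be on top of that and isn't counted here.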