Comment by QuadmasterXLII
3 days ago
The headline says hundred-billion-parameter, but none of the official models are over 10 billion parameters. Curious.
The project is an inference framework that should support a 100B-parameter model at 5-7 tok/s on CPU. No one has yet quantized a 100B-parameter model to 1 trit per weight, but the existence of this framework is an incentive for someone to do so.
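For a sense of scale, here is a back-of-envelope memory estimate for a 100B-parameter model at ~1.58 bits per weight (the information content of a ternary value) versus fp16. This is only the theoretical floor; real frameworks pack ternary weights into fixed-width words and carry per-group scales, so actual footprints are somewhat larger:

```python
import math

# Rough weight-memory floor for a 100B-parameter model.
# Ignores activations, KV cache, scales, and packing overhead.
params = 100e9
bits_fp16 = 16
bits_ternary = math.log2(3)  # ~1.585 bits per ternary weight

def gib(total_bits):
    return total_bits / 8 / 2**30

print(f"fp16:    {gib(params * bits_fp16):.0f} GiB")     # ~186 GiB
print(f"ternary: {gib(params * bits_ternary):.0f} GiB")  # ~18 GiB
```

At roughly 18 GiB of weights, a 100B ternary model fits in ordinary desktop RAM, which is what makes CPU inference at a few tokens per second plausible.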
> quantized a 100B parameter model to 1 trit
I had the same question. After some back-and-forth with ChatGPT: it's not the post-training quantization we often see these days — you have to use 1 trit per weight from the beginning, during pre-training.
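For reference, the ternary quantization used in BitNet b1.58-style training (absmean) looks roughly like the NumPy sketch below: weights are scaled by their mean absolute value, then rounded to {-1, 0, +1}. During actual training this sits in the forward pass with a straight-through estimator for gradients, which is omitted here:

```python
import numpy as np

def absmean_ternary(w, eps=1e-6):
    # Scale by the mean absolute weight, then round and clip
    # each value to the ternary set {-1, 0, +1}.
    gamma = np.mean(np.abs(w))
    q = np.clip(np.round(w / (gamma + eps)), -1, 1)
    return q, gamma  # dequantize as q * gamma

w = np.array([0.9, -0.05, 0.4, -1.2])
q, gamma = absmean_ternary(w)
print(q)  # [ 1.  0.  1. -1.]
```

The point of the comment above stands: this rounding step is part of the training loop itself, so the model learns weights that survive it, rather than having a full-precision model squeezed down afterwards.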