Comment by QuadmasterXLII
3 days ago
The headline says hundred-billion-parameter, but none of the official models are over 10 billion parameters. Curious.
The project is an inference framework that should support a 100B-parameter model at 5-7 tok/s on CPU. No one has yet quantized a 100B-parameter model to 1 trit per weight, but the existence of this framework is an incentive for someone to do so.
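For a sense of scale, here is a back-of-envelope memory estimate for a 100B-parameter model at ~1.58 bits per weight (the information content of a ternary value) versus fp16. This is only the theoretical floor; real frameworks pack ternary weights into fixed-width words and carry per-group scales, so actual footprints are somewhat larger:

```python
import math

# Rough weight-memory floor for a 100B-parameter model.
# Ignores activations, KV cache, scales, and packing overhead.
params = 100e9
bits_fp16 = 16
bits_ternary = math.log2(3)  # ~1.585 bits per ternary weight

def gib(total_bits):
    return total_bits / 8 / 2**30

print(f"fp16:    {gib(params * bits_fp16):.0f} GiB")     # ~186 GiB
print(f"ternary: {gib(params * bits_ternary):.0f} GiB")  # ~18 GiB
```

At roughly 18 GiB of weights, a 100B ternary model fits in ordinary desktop RAM, which is what makes CPU inference at a few tokens per second plausible.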
> quantized a 100B parameter model to 1 trit
I had the same question. After some back-and-forth with ChatGPT: it's not the post-training quantization we often see these days — you have to use 1 trit per weight from the beginning, during pre-training.
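For reference, the ternary quantization used in BitNet b1.58-style training (absmean) looks roughly like the NumPy sketch below: weights are scaled by their mean absolute value, then rounded to {-1, 0, +1}. During actual training this sits in the forward pass with a straight-through estimator for gradients, which is omitted here:

```python
import numpy as np

def absmean_ternary(w, eps=1e-6):
    # Scale by the mean absolute weight, then round and clip
    # each value to the ternary set {-1, 0, +1}.
    gamma = np.mean(np.abs(w))
    q = np.clip(np.round(w / (gamma + eps)), -1, 1)
    return q, gamma  # dequantize as q * gamma

w = np.array([0.9, -0.05, 0.4, -1.2])
q, gamma = absmean_ternary(w)
print(q)  # [ 1.  0.  1. -1.]
```

The point of the comment above stands: this rounding step is part of the training loop itself, so the model learns weights that survive it, rather than having a full-precision model squeezed down afterwards.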