Comment by yorwba

4 hours ago

The 1-bit Bonsai and Ternary Bonsai models are all based on the corresponding Qwen3 model: https://raw.githubusercontent.com/PrismML-Eng/Bonsai-demo/re... (page 4)

Thanks, I had suspected as much. It also gives context to the other comment here that says it is basically equivalent in accuracy to Qwen3.5-4B. It essentially seems to be a very good quantization of that model, not a new BitNet.

  • It's a quantization of Qwen3-8B that is good per byte but not in absolute terms: its accuracy is comparable to Qwen3.5-4B at 4-bit quantization. That makes the 4B model larger in terms of storage, though its lower number of parameters and hybrid attention give it a speed advantage if you're not bottlenecked on memory bandwidth for the model weights.
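
Rough arithmetic behind the storage comparison, as a back-of-the-envelope sketch rather than measured file sizes. It assumes nominal parameter counts of 8e9 and 4e9, treats ternary weights as log2(3) bits each, and ignores embeddings, scale factors, and any layers kept at higher precision; `weight_storage_gb` is a hypothetical helper, not something from the Bonsai repo.

```python
import math


def weight_storage_gb(num_params: float, bits_per_weight: float) -> float:
    """Idealized weight storage in GB (1 GB = 1e9 bytes), ignoring all overhead."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumed nominal sizes; real checkpoints differ somewhat.
PARAMS_8B = 8e9
PARAMS_4B = 4e9

one_bit_8b = weight_storage_gb(PARAMS_8B, 1.0)           # 1-bit weights
ternary_8b = weight_storage_gb(PARAMS_8B, math.log2(3))  # ternary, about 1.58 bits
four_bit_4b = weight_storage_gb(PARAMS_4B, 4.0)          # 4-bit quantized 4B model

for name, size in [
    ("8B @ 1-bit", one_bit_8b),
    ("8B @ ternary", ternary_8b),
    ("4B @ 4-bit", four_bit_4b),
]:
    print(f"{name}: {size:.2f} GB")
```

Under these assumptions the 8B model comes out at about 1 GB (1-bit) or about 1.6 GB (ternary), versus about 2 GB for the 4B model at 4 bits, so the 4B model is the larger download even though it has half the parameters.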