Comment by webXL
3 days ago
It comes from (intentionally?) misleading docs: https://github.com/microsoft/BitNet/issues/391
(only suggesting that it's intentional because it's been there so long)
That issue appears to be the one that's wrong. From the technical report:
> We evaluated bitnet.cpp in terms of both inference speed and energy cost. Comprehensive tests were conducted on models with various parameter sizes, ranging from 125M to 100B. Specific configurations for each model are detailed in Appendix A.
Thanks for pointing that out. I'll ask the issue creator if they've considered that. Would be nice if the maintainer would handle that (sigh) and link to the actual models used for testing (double sigh).
From what I gather, there are no models; this is a framework for running 1-bit models, but none have been trained. They are mainly demonstrating the possibility.
1 reply →