Comment by Aerroon
3 days ago
I have doubts about this. Perhaps the closed models have, but I wouldn't be so sure for the open ones.
GLM 5, for example, is running 16-bit weights natively. This makes their 755B-parameter model ~1.5TB in size, and it means the ~40B active parameters amount to ~80GB of weights read per token.
Compare this to Kimi K2.5: a 1T-parameter model, but with 4-bit weights (int4), which makes the model ~560GB. Its 32B active parameters are only ~16GB.
Sure, GLM 5 is the stronger model, but is that worth paying 2-3x longer generation times and 2-3x more memory for?
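The sizes above follow from simple arithmetic: total bytes = parameter count × bits per weight / 8. A minimal sketch of that math, using the figures quoted in this comment (the parameter counts and bit widths are as stated above, not independently verified specs):

```python
# Back-of-envelope weight-memory math for the models discussed above.
# Numbers are the ones quoted in the comment, not official specs.

def weight_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# GLM 5: 755B total parameters at 16-bit
glm5_total = weight_gb(755, 16)    # ~1510 GB, i.e. ~1.5 TB
glm5_active = weight_gb(40, 16)    # ~80 GB read per generated token

# Kimi K2.5: 1T total parameters at int4
kimi_total = weight_gb(1000, 4)    # ~500 GB (quoted ~560 GB, likely incl. overhead)
kimi_active = weight_gb(32, 4)     # ~16 GB read per generated token

print(glm5_total, glm5_active, kimi_total, kimi_active)
```

At memory-bandwidth-bound decode speeds, the bytes of active weights moved per token is what drives generation latency, which is why the 80GB-vs-16GB gap matters more than the total model size.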
I think this barrel's bottom really hasn't been scraped.