← Back to context

Comment by pphysch

2 years ago

This isn't "tierificaton" or even premiumization. That may come later.

Large AI models have tight resources requirements. You physically can't use X billion parameters without ~X billion ~bytes of memory.

It makes complete sense to have these 3 "tiers". You have a max capability option, a price-performance scaling option, and an edge compute option.

> Large AI models have tight resources requirements. You physically can't use X billion parameters without ~X billion ~bytes of memory.

Well, X billion bits times the parameter bit size. For base models, those are generally 32-bit (so 4X bytes), though smaller quantizations ate possible and widely used for public models, and I would assume as a cost measure for closed hosted models as well.