Comment by pphysch
2 years ago
This isn't "tierification" or even premiumization. That may come later.
Large AI models have tight resource requirements. You physically can't use X billion parameters without ~X billion ~bytes of memory.
It makes complete sense to have these 3 "tiers". You have a max capability option, a price-performance scaling option, and an edge compute option.
> Large AI models have tight resource requirements. You physically can't use X billion parameters without ~X billion ~bytes of memory.
Well, X billion parameters times the parameter size in bytes. For base models, parameters are generally 32-bit (so 4X bytes), though smaller quantizations are possible and widely used for public models, and I would assume as a cost-saving measure for closed hosted models as well.
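The arithmetic here is easy to sketch. A back-of-the-envelope calculation (just weight storage; it ignores activations, KV cache, and runtime overhead, and the function name is my own):

```python
# Rough memory needed just to hold the weights of an X-billion-parameter
# model at a given precision. Decimal GB (1 GB = 1e9 bytes) for simplicity.
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9

# A 7B-parameter model at common precisions:
for bits in (32, 16, 8, 4):
    print(f"7B params @ {bits}-bit: {weight_memory_gb(7, bits):.1f} GB")
# 32-bit -> 28.0 GB, 16-bit -> 14.0 GB, 8-bit -> 7.0 GB, 4-bit -> 3.5 GB
```

So the "~X billion bytes" rule of thumb corresponds to 8-bit quantization; fp32 base weights need about 4X.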
Hence ~