Comment by embedding-shape
3 months ago
Depends heavily on the architecture too, I think; a free-for-all to find the best sizes is still ongoing, and rightly so. GPT-OSS-120B, for example, fits in around 61GB of VRAM for me in MXFP4.
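Quick sanity check on that figure (a rough sketch, assuming MXFP4 stores ~4-bit values plus a shared 8-bit scale per 32-element block, i.e. about 4.25 bits per parameter, and taking the nominal 120B parameter count at face value):

```python
def mxfp4_size_gb(num_params: float, bits_per_param: float = 4.25) -> float:
    """Approximate weight footprint in GB for a block-quantized model.

    bits_per_param of 4.25 assumes 4-bit values plus one 8-bit scale
    shared across every 32 elements (4 + 8/32 bits per parameter).
    """
    return num_params * bits_per_param / 8 / 1e9

# Nominal 120B parameters lands in the low 60s of GB, in the same
# ballpark as the ~61GB observed (before KV cache or runtime overhead).
print(f"{mxfp4_size_gb(120e9):.2f} GB")
```

The actual number depends on the exact parameter count and how much of the model (embeddings, norms) stays in higher precision, so treat this as a ballpark only.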
Personally, I hope GPU makers instead start adding more VRAM, or if one can dream, expandable VRAM.
Unlikely we'll see more VRAM in the short term; memory prices are through the roof :/ and not subtly, either: 2-4x.
Well, GPUs are getting more VRAM, although it's pricey. We didn't use to have 96GB VRAM GPUs at all; now they exist :) For those who can afford it, it's at least possible today. It's increasing, slowly.
Agreed, in the limit, RAM go up. As billg knows, 640K definitely wasn't enough for everyone :)