Comment by hadlock

15 hours ago

Consumer devices are already available that offer 128gb specifically labeled for AI use. I think server-side AI will still exist for IoT devices, but I agree, 10 years seems like a pretty reasonable timeline to buy an RTX 5080-sized card that will have 1TB of memory, with the ability to pair it with another one for 2TB. For local, non-distributed use, GPUs are already more than capable of doing 20+ tokens/s; we're mostly waiting on 512gb devices to drop in price, and "free" LLMs to get better.

Are we constrained by RAM production?

RAM price per GB is projected to decline at 15% per annum.

At that rate, price per GB halves roughly every 4.3 years, so it's quite a few years before you get double the RAM for the same money.
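As a quick back-of-the-envelope check (a minimal sketch in Python; the 15% annual decline is the projection quoted above, not a measured figure):

    import math

    annual_decline = 0.15  # projected RAM price decline per year (assumption from the projection above)

    # Price after t years: p(t) = p0 * (1 - annual_decline)**t
    # RAM per dollar doubles when price halves: (1 - annual_decline)**t = 0.5
    t_double = math.log(2) / -math.log(1 - annual_decline)
    print(f"RAM per dollar doubles roughly every {t_double:.1f} years")  # ~4.3 years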

For mobile, I'm guessing power constraints matter too.