Comment by deadbabe
18 hours ago
Once you can run a GPT-5-level LLM locally on a device, it's over. All this mighty infrastructure will be no more impressive than a top-of-the-line 2013 Mac Pro is in 2025. I think we're 10 years away from that.
10 years from now, the capabilities of GPT-5 will be as relevant to AI as the Atari is to modern gaming.
Unless GPT 5 is more like the PlayStation 5 and not the Atari 2600.
So people will be paying money for the nostalgia of ChatGPT after it dies? That tracks.
Walmart will sell cheap Chinese emulations to hop on the nostalgia train too.
I doubt it. Newer state of the art models might be a little better, but not enough to justify paying $1000/month for the average person or employee.
If you can get a GPT5 level AI, locally and privately, for just the cost of electricity, why would you bother with anything else? If it can’t do something, you’d just outsource that one prompt to a cloud based AI.
The vast majority of your prompts will be passing through a local LLM first in 2035, and you might rarely need to touch an agent API. So what does that mean for the AI industry?
Consumer devices are already available that offer 128 GB specifically labeled for AI use. I think server-side AI will still exist for IoT devices, but I agree: 10 years seems like a reasonable timeline before you can buy a GTX 5080-sized card with 1 TB of memory, with the ability to pair it with another one for 2 TB. For local, non-distributed use, GPUs are already more than capable of doing 20+ tokens/s; we're mostly waiting on 512 GB devices to drop in price, and "free" LLMs to get better.
Are we constrained by RAM production?
RAM price per GB is projected to decline at 15% per annum.
That's quite a few years before you'll get double the RAM.
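To put a number on "quite a few years": if price per GB falls 15% per year, a fixed budget buys 1/0.85 ≈ 1.18x more RAM each year, and you can solve for the doubling time. A quick sketch (the 15%/yr figure is the projection quoted above, not a measured trend):

```python
import math

# Assumption from the comment above: RAM price per GB declines ~15% per year,
# so RAM-per-dollar grows by a factor of 1/0.85 annually.
annual_decline = 0.15
growth_per_year = 1 / (1 - annual_decline)

# Years until a fixed budget buys double the RAM: solve growth^n = 2
years_to_double = math.log(2) / math.log(growth_per_year)
print(round(years_to_double, 1))  # -> 4.3
```

So at that rate it's roughly 4.3 years per doubling, i.e. about three doublings (8x) over the 10-year horizon being discussed.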
For mobile I'm guessing power constraints matter too.
Have you tried the reasoning mode of Gemini Pro 2.5? https://aistudio.google.com/
It gives me the chills, thinking about when it has 1000x cheaper ~GPU compute.
This a hosted model with closed weights, though.
Yes. Not the point.