Comment by anonyfox
4 hours ago
Sounds like the money grab is accelerating before consumer-grade local models get good enough for local inference in a few years. Huge house of cards here: demand skyrockets until it suddenly drops entirely once on-device inference arrives.
I'm already living in this future. In a decent execution framework, with context management, memory via Unix files, and mechanisms for web search and access, local models are effectively on par with frontier ones. And they can often be much faster. I'll keep paying fees to the AI companies as long as they keep truly subsidizing and leading. They are getting close to the edge of their utility, but we can use their services now to bootstrap their own demise. Long live running your own software on your own computer.
> consumer grade local models are getting good enough for local inference
I am waiting for that. Perhaps a Taalas-style high-performance custom-hardware LLM coding engine paired with an open-source coding agent. Priced like a high-end graphics card, it would pay for itself over time. It will be a replay of the IBM-mainframe-to-PC transition of a previous era.
> I am waiting for that
Same, and I think we're close. "The original 1984 128k Mac model was $2,495, and the 1985 512k Mac was $2,795" [1]. That's $8 to 9 thousand today. About the price of a 32-core, 80-GPU M3 Ultra Mac Studio with 256 GB RAM.
[1] https://blog.codinghorror.com/a-lesson-in-apple-economics/
[2] https://www.bls.gov/data/inflation_calculator.htm
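The inflation adjustment above is easy to sanity-check. A quick sketch, using approximate annual-average CPI-U values that I'm assuming here (the BLS calculator in [2] will give slightly different, month-exact figures):

```python
# Approximate annual-average CPI-U values (assumed, not BLS-exact).
CPI_1984 = 103.9
CPI_1985 = 107.6
CPI_2024 = 313.7

def adjust(price, cpi_then, cpi_now=CPI_2024):
    """Scale a historical price to today's dollars by the CPI ratio."""
    return price * cpi_now / cpi_then

# 1984 128k Mac at $2,495 and 1985 512k Mac at $2,795:
mac_128k = adjust(2495, CPI_1984)  # roughly $7.5k
mac_512k = adjust(2795, CPI_1985)  # roughly $8.1k
print(round(mac_128k), round(mac_512k))
```

That lands in the same ballpark as the "$8 to 9 thousand" figure, i.e. high-end Mac Studio territory.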
The maxed-out 512 GB RAM Mac Studio is no longer available from Apple and is now pushing $20 thousand on the secondary market. And we might not even see a new Mac Studio release from Apple before October.
The consumer models are quite good already; the main bottleneck on local inference is hardware. But even then, you can run tiny models on almost anything. Things only get harder as you scale up to more knowledgeable models and larger context windows.