Comment by anonyfox
6 hours ago
Sounds like the money grab is accelerating before consumer-grade local models get good enough for local inference in a few years. Huge house of cards here. Demand skyrocketing until it suddenly drops off entirely once on-device inference takes over.
I'm already living in this future. In a decent execution framework, with context management, memory via the Unix filesystem, and mechanisms for web search and access, local models are effectively on par with frontier ones. And they can often be much faster. I'll keep paying fees to the AI companies until they stop truly subsidizing and leading. They are getting close to the edge of their utility, but we can use their services now to bootstrap their own demise. Long live running your own software on your own computer.
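To give a rough idea of what I mean, here's a minimal sketch of the harness, not my exact setup: the endpoint, model name, and memory file are placeholders for whatever OpenAI-compatible local server (llama.cpp's llama-server, Ollama, etc.) and model you happen to run.

    # Minimal local agent-harness sketch: a chat loop against a local
    # OpenAI-compatible server, with long-term memory as a plain Unix file.
    # Assumptions: a server listening on localhost:11434 and an already-pulled
    # model; a real web-search tool would be wired in the same way.
    import pathlib
    import requests

    ENDPOINT = "http://localhost:11434/v1/chat/completions"  # any OpenAI-compatible local server
    MODEL = "qwen2.5-coder:32b"          # placeholder: whatever local model you actually run
    MEMORY = pathlib.Path("memory.md")   # plain file on disk as persistent memory

    def chat(messages):
        resp = requests.post(ENDPOINT, json={"model": MODEL, "messages": messages})
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    def ask(user_prompt):
        memory = MEMORY.read_text() if MEMORY.exists() else ""
        messages = [
            {"role": "system", "content": "You are a local assistant. Notes from previous sessions:\n" + memory},
            {"role": "user", "content": user_prompt},
        ]
        answer = chat(messages)
        # Append a short trace so the next session starts with context.
        with MEMORY.open("a") as f:
            f.write(f"- {user_prompt[:80]} -> {answer[:80]}\n")
        return answer

    if __name__ == "__main__":
        print(ask("Summarize what we worked on last time."))

The context management and the web-search tooling are where the real tinkering goes, but the skeleton is genuinely this small.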
What setup are you using? What models, what hardware, what agent harness, etc.? I have the vague sense that this is all possible right now, but the amount of tinkering required doesn't seem worth it compared to, like, just not using AI and getting stuff done the old-fashioned way.
I just don't believe you.
We can all see the vast gulf between paid and open AI in image and video; it's plainly visible. Compare Grok to Wan or LTX or whatever and the difference is vast. There's no debate that those open models are three or four generations behind, because you can't argue with your eyes.
But DIYers like you claim that text LLMs are on a par with the frontier models?
Again, I simply don't believe you. I can't be bothered to download however many GB it is to find out, because the result is going to be completely underwhelming, like going back to 2023.
And worse, when these 'open' models do start getting good, what makes you think these companies will carry on open-sourcing them?
At the moment they're trying to stay relevant and attract investment. When these models do start getting good, they won't give away the weights; they'll sell them.
They're not actually open.
And then in a year or two your 'open' model will be horribly out of date, with knowledge frozen at whatever date its training data ends, because you can't add to the knowledge of the model.
So in a year or two, those models will be worthless. That's why Alibaba, Meta, etc. are giving them away.
> consumer-grade local models get good enough for local inference
I am waiting for that. Perhaps a Taalas-style high-performance custom-hardware coding LLM engine paired with an open-source coding agent. Priced like a high-end graphics card, it would pay for itself over time. It would be a replay of the IBM mainframe-to-PC transition of a previous era.
> I am waiting for that
Same, and I think we're close. "The original 1984 128k Mac model was $2,495, and the 1985 512k Mac was $2,795" [1]. That's $8 to $9 thousand in today's dollars [2]. About the price of a 32-core CPU, 80-core GPU M3 Ultra Mac Studio with 256 GB RAM.
[1] https://blog.codinghorror.com/a-lesson-in-apple-economics/
[2] https://www.bls.gov/data/inflation_calculator.htm
The maxed-out 512 GB RAM Mac Studio is no longer available from Apple and is now pushing $20 thousand on the secondary market. And we might not even see a new Mac Studio release from Apple before October.
The consumer models are quite good already; the main bottleneck on local inference is hardware. But even then, you can run tiny models on almost anything; things only get harder as you try to scale up to more knowledgeable models and a larger context.
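For what it's worth, the "runs on almost anything" part really is a few lines these days. Here's a minimal sketch with llama-cpp-python; the GGUF path and context size are placeholders, and it's exactly those two knobs (model size and n_ctx) that push you from a laptop CPU toward serious hardware.

    # Minimal local inference sketch with llama-cpp-python.
    # A small quantized GGUF model (a few GB) runs CPU-only on most machines;
    # larger models and a larger n_ctx are what start to demand real hardware.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/some-small-instruct-q4_k_m.gguf",  # placeholder path to any GGUF file
        n_ctx=4096,       # context window: the main knob that eats RAM as you scale up
        n_gpu_layers=0,   # 0 = pure CPU; raise it if you have spare VRAM
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Explain what a GGUF file is in two sentences."}],
        max_tokens=128,
    )
    print(out["choices"][0]["message"]["content"])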