Comment by echelon
8 hours ago
Local is a dead end.
Open source efforts need to give up on local AI and embrace cloud compute.
We need to stop building toy models to run on RTX and instead try to compete with the hyperscalers. We need open weights models that are big and run on H200s. Those are the class of models that will be able to compete.
When the hyperscalers reach takeoff, we're done for. If we can stay within ~6 months, we might be able to slow them down or even break them.
If there were something 80-90% as good as Opus or Seedance or Nano Banana, more of the ecosystem would switch to open source, because it offers control and sovereignty. But we don't have that right now.
If we had really competitive open weights models, universities, research teams, other labs, and other companies would be able to collaboratively contribute to the effort.
Everyone in the open source world is trying to shrink these models to fit on their 3090 instead, though, and that's such a wasted effort. It's short-term thinking.
An "OpenRunPod/OpenOpenRouter" plus one-click deploys of models just as good as Gemini will win over LM Studio and ComfyUI hacking together a solution on your own Nvidia gaming card.
That's such a tiny segment of the market, and the tools are all horrible to use anyway. It's like we learned nothing from "The Year of Linux on the Desktop" circa 1999. Only when we realized the data center was our friend did we frame our open source efforts appropriately.
> We need open weights models that are big and run on H200s.
We have this class of models already: Kimi 2.5 and GLM-5 are proper SOTA models, and a larger Nemotron model might also be released at some point. With the NVMe-based offloading that's being worked on lately, you can even experiment with these models on your own hardware, and of course there are plenty of cheap third-party inference platforms for them too.
> Open source efforts need to give up on local AI and embrace cloud compute.
Oh god no, please not more slop. You're already consuming over 1 percent of human energy output; could you, like, chill a bit?
In a similar vein: seek efficiency.
I.e., /if/ I am going to consume LLM tokens, I figure that a local LLM with tens of billions of parameters running on commodity hardware at home will still consume far more energy per token than a frontier model running on commercial hardware that is very strongly incentivized to be as efficient as possible. Do the math; it isn't even close (rough sketch below). (Maybe it'd be closer in your local winter, where your compute heat could offset your heating requirements. But that gets harder to quantify.)
Maybe it's different if you have insane and modern local hardware, but at least in my situation that is not the case.
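For what it's worth, here's the back-of-envelope. Every number is an assumption I made up for illustration (whole-system wattage, single-user local throughput, batched datacenter throughput), not a measurement:

    # Energy per generated token = power draw / generation throughput.
    # All numbers below are illustrative assumptions, not measurements.
    def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
        # Watts are joules per second, so W / (tokens/s) = J/token.
        return power_watts / tokens_per_second

    # Assumed: ~30B local model, whole system at ~400 W, ~25 tokens/s for one user.
    local = joules_per_token(400, 25)

    # Assumed: datacenter accelerator at ~700 W, heavily batched,
    # ~2000 tokens/s of aggregate throughput across many users.
    datacenter = joules_per_token(700, 2000)

    print(f"local:      {local:.1f} J/token")        # 16.0 J/token
    print(f"datacenter: {datacenter:.2f} J/token")   # 0.35 J/token
    print(f"ratio:      {local / datacenter:.0f}x")  # ~46x

Under those assumptions the gap is well over an order of magnitude; plug in your own numbers if you disagree with mine.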
But commodity hardware that's right-sized for your own private needs is many orders of magnitude cheaper than datacenter hardware that's meant to serve millions of users simultaneously while drawing gigawatts. When you buy LLM tokens, you're mostly paying for that hardware, not just for its power efficiency. And your own hardware stays available for non-AI needs, while paying for tokens means covering those needs separately some other way.
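A rough sketch of the cost side, where the GPU price, lifespan, electricity rate, throughput, and API price are all made-up but plausible assumptions:

    # Rough cost comparison: owning a GPU vs. buying API tokens.
    # Every number below is an illustrative assumption, not a real price.
    gpu_cost = 700.0                   # assumed: used RTX 3090, ~$700
    hours = 3 * 365 * 24               # assumed: amortized over 3 years
    power_kw = 0.350                   # assumed: ~350 W under load
    elec_per_hour = power_kw * 0.15    # assumed: $0.15/kWh

    tokens_per_hour = 25 * 3600        # assumed: ~25 tokens/s locally

    own_per_mtok = (gpu_cost / hours + elec_per_hour) / tokens_per_hour * 1e6
    api_per_mtok = 10.0                # assumed: ~$10 per million output tokens

    print(f"own hardware: ${own_per_mtok:.2f} / Mtok")  # ~$0.88, if it runs 24/7
    print(f"API tokens:   ${api_per_mtok:.2f} / Mtok")

The catch is utilization: that per-token figure only holds if the card is generating around the clock. At hobbyist duty cycles the amortized hardware cost per token climbs fast, but the box is still yours for everything else.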
Y'all aren't seeing the same future I am, I guess.
- Our career is reaching the end of the line
- 99.9999% of users will be using the cloud
- If we don't have strong open source models, we're going to be locked into hyperscaler APIs for life
- Piddly little home GPUs don't do squat against this
Why are you building for hobby uses?
Build for the freedom to make and scale businesses. To remain competitive. To have options in the future, independent of the hyperscalers.
We're going to be locked out of the game soon.
Everyone should be panicking about losing the ability to participate.
Play with your RTXes all you like. They might as well be Raspberry Pis. They're toys.
Our future depends on our ability to run and access large-scale, competitive open weights. Not stuff you run with LM Studio or ComfyUI as a hobby.
I don't agree that we're being left behind with regard to AI; I believe it's simply not worth participating in. I hope it all comes crashing down.
Man, going to personal computing was a mistake; we should've stayed jacked into the mainframes /s
Entire device categories, like smartphones, are locked down. That's our future.
Here's my retort: https://news.ycombinator.com/item?id=47543367