Comment by LucidLynx
19 hours ago
The winner in the long term will be the one that delivers the best performance-to-memory ratio for local models.
Anthropic, OpenAI and Mistral are companies making revenue right now (though still not profitable), but they will lose their traction and value in the long term.
However, I am more curious to see how OpenCode Go subscriptions will go in the future: cheaper than big tech, more tokens, and they don't train on our data to (try to) improve...
Not training on data is a con for me, not a pro. The reason Claude is so good is RL training on users' chat histories and use cases. The era of pure public-data training is over: everyone has access to that data, yet only a few models are frontier models.
Local models will never compete with large SOTA models, in the same way an iPhone doesn't compete with supercomputers doing nuclear simulations.
Their paths will diverge and split. SOTA models will probably end up locked down and accessible only to state actors because of how expensive they will be to run (that has already started with Mythos).
> SOTA models will eventually be locked down
that might be true for US-based providers, but i don't see china turning closed source anytime soon.
a lot of chinese labs come out of big, non-AI-focused cloud companies (alibaba, tencent, huawei) that want new models with higher benchmark scores and lower inference cost. they don't care if the competition gets better, because it's all open and they can build off each other's tech, and if anything goes wrong they have other profitable services to fall back on instead of depending on llms alone like anthropic does.
also the business culture is way different. in vc-backed america you'd get laughed out of the room for saying "there is no moat, we just do the same thing as everyone else but better"; you need to show infinite potential growth and lock everything down to prevent competition, but you can raise millions to start with no customers and no profits. in china it's all about real money: they don't care if your margin is 10 or 90 percent as long as you stay profitable. the llm providers there are profitable, so they keep their business model.
You don't need one huge model to do everything. You need smaller, specialized models that are very good at specific tasks and collaborate among themselves.
The stagnation we see in parameter counts shows that capability does not scale linearly with model size; it's more of an S-shaped curve, and the middle of that curve was Claude 3.5. Since then, progress has been more about integrating and collaborating with different systems.
it's a big assumption that larger models bring any measurable benefit in the long term. there's a point where it's not worth paying the expense of a bigger model, and we don't know where that point will be as both models and hardware improve.
we do know, however, where evolution landed with our brains, but that's probably not comparable - yet it's the only reference point I can see for making any kind of prediction at all
Isn't Mythos mostly hype though?
https://hacks.mozilla.org/wp-content/uploads/2026/05/securit...
Current local models already compete.
A Qwen3.6-35B-A3B (or whatever its full name is), running on a 3090, can, with very little fine-tuning, at the very least compete with Haiku and blow away GPT4.1 (i.e., the cheap models).
It might keep up with Sonnet 4.5 with some tinkering.
But long story short: it seems to deliver better performance and similar quality, with the hardware paying for itself within a year or so compared to cloud models. In the same way, you can self-host faster/easier/cheaper than cloud hosting, if you are okay with the downsides.
I'm returning my 3090 soon for an R9700 after some more basic benchmarking, since the extra VRAM should improve my results further.
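For context on why the extra VRAM matters, here's a rough back-of-envelope sketch in Python of what fits on each card. The per-parameter byte counts are standard for these quantizations, but the KV-cache budget and card capacities are my assumptions, so treat the output as an estimate rather than a measurement:

    # Rough VRAM estimate: model weights at a given quantization, plus an
    # assumed KV-cache budget, compared against assumed card capacities.
    PARAMS = 35e9                             # ~35B total parameters
    BYTES_PER_PARAM = {"FP16": 2.0, "Q8": 1.0, "Q4": 0.5}
    KV_CACHE_GB = 3.0                         # assumption: few-thousand-token context
    CARDS_GB = {"RTX 3090": 24, "R9700": 32}  # assumption: advertised VRAM

    for quant, bpp in BYTES_PER_PARAM.items():
        weights_gb = PARAMS * bpp / 1e9
        total_gb = weights_gb + KV_CACHE_GB
        fits = [card for card, cap in CARDS_GB.items() if total_gb <= cap]
        print(f"{quant}: ~{total_gb:.0f} GB total, fits: {', '.join(fits) or 'neither'}")

At Q4 the weights alone are around 17.5 GB, so a 24 GB card works but leaves little headroom for context; 32 GB gives room for longer contexts or a less aggressive quantization. (Note that even for an A3B MoE, all the weights still have to sit in memory; only the active compute shrinks.)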
You are missing the point. Parent says the market to be won needs economical models more than SOTA models. Whoever is running those nuclear simulations is not making as much money as Apple.
If we extend this line of thinking, China might be leading that race.
> Anthropic, OpenAI and Mistral ...
Mistral? I think their "revenue" is something like 1/150th of what OpenAI and Anthropic are making.