Comment by flir

18 hours ago

I do. "Commoditize your complement". Want to sell lots of silicon? Give away good local models to run on that silicon.

Even if SOTA models in the cloud are a few percentage points better, most work can be routed to local models most of the time. That leaves the cloud providers fighting over the most computationally intensive tasks. In the long term, I think models are going to be local-first.

(Unless providers can figure out a network effect that local models can't replicate).
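The "route most work locally" idea can be sketched as a simple dispatcher. Everything below is illustrative: the difficulty heuristic, the threshold, and the backend names are hypothetical stand-ins, not a real API.

```python
# Hypothetical local-first router: keep a task on the local model unless it
# looks computationally heavy, then fall back to a cloud endpoint.
# The heuristic and threshold are illustrative assumptions.

def estimate_difficulty(prompt: str) -> float:
    """Crude proxy: longer prompts and code-heavy prompts score higher."""
    score = min(len(prompt) / 4000, 1.0)
    if "```" in prompt:          # code blocks suggest a harder task
        score += 0.3
    return min(score, 1.0)

def route(prompt: str, threshold: float = 0.7) -> str:
    """Return which backend would handle this prompt."""
    return "cloud" if estimate_difficulty(prompt) >= threshold else "local"

print(route("Summarize this paragraph."))          # short task stays local
print(route("```\n" + "x = 1\n" * 2000 + "```"))   # heavy task goes to cloud
```

In practice the router would call the local model first and escalate on failure or timeout; the point is only that the cheap path handles the common case.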

> I think models are going to be local-first.

Why on earth would that happen when everything else is moving into the cloud to tie it to ever-escalating subscription fees and prevent piracy?

Even with gaming, where running high-end 3D games in the cloud seems like madness and inevitably degrades the quality of the experience, they won't stop trying.

> In the long term, I think models are going to be local-first.

Why? There's an inherent efficiency advantage to scale, while the only real advantage for local models (privacy/secrecy) hasn't proven convincing for broader IT either.

  • It's foolish not to care about privacy, especially as a company. You know how your employer prevents you from emailing yourself your tax documents? Meanwhile thousands of employees are sending literal design docs, software, product goals, etc. to multiple AI third parties. Not only is that insane, the companies they're sending it to openly admit to scanning the data, make software products themselves, and intend to create models that can produce their products automatically.

    The reasons local models haven't caught on are severalfold. It's good marketing to say your company follows the latest trend, and there's an inherent pressure to keep AI companies afloat so the economy doesn't collapse. The other is that it wasn't until the last month or so that local models caught up with frontier models. They just did, and they're more efficient and don't require a team of 500 to deploy.

  • Local-first models aren't just more private than the API vendors'; they also have the advantages of fixed cost, lower latency, and better stability: local models don't get nerfed or "updated" in the background the way ChatGPT does.

    Maybe in a world where these AI companies behaved with some semblance of ethics and user-friendliness they would be on even ground, but for anyone paying attention local models are obviously the future.

  • > the only real advantage for local models (privacy/secrecy) hasn't proven convincing for broader IT either

    Because of nonexistent regulation. Just wait for it…

    The legal situation in the EU, for example, is crystal clear; it will just take some time to go through all the court instances.

  • To not depend on an external company that can decide the price.

    • That's a silly reason. For non-agent use cases what kind of utilization are you going to average on your own GPU, 5-10%? And that's without batching.

      Even with overhead and scaling for peak use and a large profit margin, any company with an ounce of competition will be vastly cheaper than self-hosting. And for models you can run yourself, there will be plenty of competition.
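      The utilization argument can be made concrete with back-of-envelope numbers. Every figure below is a hypothetical assumption, not a measured price or throughput:

      ```python
      # Back-of-envelope: self-hosted GPU at low utilization vs. a batched
      # API provider. All numbers are illustrative assumptions.

      GPU_COST_PER_HOUR = 2.00   # amortized hardware + power for one local GPU
      UTILIZATION = 0.08         # ~8% of each hour the GPU is actually generating
      TOKENS_PER_SEC = 50        # sustained generation rate while busy

      busy_seconds = 3600 * UTILIZATION
      tokens_per_hour = busy_seconds * TOKENS_PER_SEC
      self_hosted_cost_per_mtok = GPU_COST_PER_HOUR / tokens_per_hour * 1_000_000

      print(f"self-hosted: ${self_hosted_cost_per_mtok:.2f} per million tokens")

      # A provider that batches many users keeps the same GPU near full
      # utilization, so its cost floor per token is roughly UTILIZATION
      # times lower than the self-hosted figure.
      api_floor = self_hosted_cost_per_mtok * UTILIZATION
      print(f"API floor:   ${api_floor:.2f} per million tokens")
      ```

      Under these assumptions the idle self-hosted GPU costs roughly an order of magnitude more per token than the provider's floor, which is the gap a profit margin has to fit inside.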
