Comment by throwaway27448

21 hours ago

> China is pursuing these because they cannot compete on the frontier.

? Claude, ChatGPT, etc are heinously expensive for tiny benefits lmao. Local + efficient is clearly the future

> ? Claude, ChatGPT, etc are heinously expensive for tiny benefits lmao

Unfortunately, local inference is inefficient: hundreds of times less efficient than cloud inference. When you answer one request at a time, you still have to fetch all active weights from memory into the compute units once per token. When you run a batch of 300 requests, you load the weights once and compute 300 tokens at a time.
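The arithmetic behind that claim can be sketched with rough, assumed numbers — a 70 GB set of active weights and ~1 TB/s of memory bandwidth are illustrative figures, not measurements, and real batched serving eventually hits compute limits rather than scaling linearly forever:

```python
# Back-of-the-envelope: token generation is memory-bandwidth bound,
# so every decoding step requires streaming all active weights from
# memory. All numbers below are illustrative assumptions.

WEIGHTS_GB = 70        # assumed: active weights (e.g. ~70B params at 8-bit)
BANDWIDTH_GBPS = 1000  # assumed: ~1 TB/s memory bandwidth

def tokens_per_second(batch_size: int) -> float:
    """Tokens/s when each full weight pass serves `batch_size` requests."""
    weight_passes_per_second = BANDWIDTH_GBPS / WEIGHTS_GB
    # One token per request per weight pass (ignores compute limits,
    # which in practice cap how far batching scales).
    return weight_passes_per_second * batch_size

local = tokens_per_second(1)    # single-user local inference
cloud = tokens_per_second(300)  # batched cloud inference
print(f"local: {local:.1f} tok/s, batched: {cloud:.1f} tok/s, "
      f"ratio: {cloud / local:.0f}x")
```

Under these toy assumptions, the same hardware produces 300x more tokens per second when serving a batch of 300 — which is where the "hundreds of times" figure comes from.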

Compared to cloud, local inference is also less flexible. You can't scale up 5x or 20x, you can't absorb spikes, and you pay for the hardware whether you use it or not, even though typical utilization is very low, around 5%. And to run a decent model, your system costs $2,000 or more.
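To put a number on the utilization point: a quick sketch, using the $2,000 rig and 5% utilization figures from above plus an assumed 3-year hardware lifetime, shows how idle time inflates the effective cost of every hour you actually use:

```python
# Rough amortization sketch; the lifetime is an assumption for illustration.
HARDWARE_COST = 2000            # local rig price from the comment, USD
LIFETIME_HOURS = 3 * 365 * 24   # assumed: 3-year hardware lifetime

def cost_per_useful_hour(utilization: float) -> float:
    """Hardware cost amortized over only the hours the rig is busy."""
    return HARDWARE_COST / (LIFETIME_HOURS * utilization)

print(f"fully utilized: ${cost_per_useful_hour(1.00):.2f}/useful hour")
print(f"5% utilized:    ${cost_per_useful_hour(0.05):.2f}/useful hour")
```

At 5% utilization, each useful hour costs 20x what it would on fully loaded hardware — a cloud provider running at high utilization pockets exactly that gap.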

AI boosters cling to this notion because it's the only way the massive data center buildouts make any sense at all. I guess you could say the US is winning the frontier AI race. Okay. But I'm never going to grant a cloud service access to all the contents of my hard drive; that's just never going to happen. So if you expect me, and the many people who feel similarly, to get on this train, you'd better have a local, lightweight model too. Otherwise we're not even having a discussion; the answer is just no.

  • The thing is, frontier model providers don’t take your feelings into account even a little bit. It’s totally irrelevant to the discussion, because the service they provide is predicated on access to high-power GPU slices that local models can’t touch. Those providers won’t face an existential crisis because some people choose the privacy route; it’s just a cost of doing business.

    • Right, but the service being sold is predicated on products being sold to users, yes? Or are we still pretending that the hyperscalers can just pass the same $20 billion between themselves and that it will be a growth industry forever?


> ? Claude, ChatGPT, etc are heinously expensive for tiny benefits lmao. Local + efficient is clearly the future

Corporate America is where the money is, and corporate America will dictate which products succeed by virtue of spend. Individuals aren't going to pay hundreds or thousands of dollars a month en masse for these models, but businesses will. Being local and efficient isn't that important at this stage, and even so, as American companies continue to scale and invest, they'll be able to make their models more local and efficient if the market wants it. Sort of like how you once had a big, giant desktop computer and now you've got a supercomputer in your pocket. Going straight to "local and efficient" means going straight to being behind, because at some point, perhaps even now, the local and efficient model won't be able to keep up.

For some reason, people think they somehow know something that Google or Nvidia or whoever, with hundreds of billions of dollars of real money at stake, don't already know, and it's both amusing and bizarre to see this play out again and again in off-hand comments like "lol tiny benefits".

You buy an iPhone even though the cheap-o Wal-Mart Android phone for $100 "does the same thing". Except that in this case the Android phone just puts you out of business, while those spending big money for "tiny benefits" beat you in the market.

  • > You buy an iPhone even though the cheap-o Wal-Mart Android phone for $100 "does the same thing".

    People buy iPhones because of status signalling and network effects, neither of which appears to apply to AI model choice. LLMs are already rapidly on the way to being interchangeable commodities.

    • No they don't; it's not 2008. Anybody off the street can get an iPhone, or a free iPhone with a mobile plan. They're commodity products. Even homeless people have them.

      To the extent that LLMs are commodity products, you're right (so far), but that's limited to the main model providers, such as ChatGPT, Claude, Gemini, etc., with interoperability via cloud platform providers and other technology providers, like Apple offering you a choice of LLM with Siri.

      If you want to suggest that some other model is in the same bucket as those primary three, it goes back to the crappy, cheap phone analogy, which is accurate: yeah, you can make calls with it, but you make calls better with an iPhone.
