Comment by ungreased0675
10 hours ago
Right now LLMs are heavily subsidized. When that ends, the actual cost of the service may exceed its usefulness for many use cases.
Computation halves in price every ~2 years, so maybe in the short term, but not in the long term.
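To put rough numbers on that halving claim, here is a minimal sketch, assuming price falls smoothly by half every 2 years (the ~2-year figure is the rule of thumb above, not a measurement; `projected_cost` and its parameters are made-up names for illustration):

```python
def projected_cost(initial_cost: float, years: float,
                   halving_period_years: float = 2.0) -> float:
    """Cost after `years`, assuming it halves every `halving_period_years`.

    This is an idealized exponential decay; real prices are lumpy.
    """
    return initial_cost * 0.5 ** (years / halving_period_years)


# Relative cost of a fixed amount of computation over time (1.0 = today).
for years in (0, 2, 4, 10):
    print(f"after {years:>2} years: {projected_cost(1.0, years):.5f}")
```

Under that assumption a workload costs about 3% of today's price after a decade, but only half after the first two years, which is why the short-term and long-term pictures can differ.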
How is that possible when the cost of memory and hard drives has gone up 3x+ in the last six months? Maybe it's cheaper if you're OAI or one of the lucky companies Nvidia is propping up. Everyone else is getting screwed.
I'm less sure that subsidized token consumption will end and that, in isolation, this would change things. I think we've seen this play out before with other tech companies, where discounting early use ends up entrenching demand and allowing the company to build larger and more efficient infrastructure.
I'm slightly _more_ convinced (still not all that strongly) that the rising cost of memory and chips, data center construction that gets outpaced by computing demand, increasing energy costs, and low switching costs for customers will force the model labs to make changes that increase the barrier to entry (via higher pricing, more restrictive rate limiting, etc.) or force their customers into longer-term commitments.
> I think we've seen this play out before with other tech companies where discounting early use ends up entrenching demand and allowing the company to build larger and more efficient infrastructure.
We've also seen failures who were convinced "they would make it up in volume." I guess the bet is that infra will get that much more efficient, but it's not clear how much slack there is.
A lot - and over the coming 2 years, even more. Utilization rates are under 50% across the board, and specialized, cheaper chips for inference are coming out all the time. There's also a truckload of research: TurboQuant, HC (DeepSeek), etc.