Comment by almostdeadguy

10 hours ago

I'm less sure that ending subsidized token consumption (in isolation) will happen and change this. I think we've seen this play out before with other tech companies where discounting early use ends up entrenching demand and allowing the company to build larger and more efficient infrastructure.

I'm slightly _more_ convinced (still not all that strongly) that the rising cost of memory and chips, data center construction that gets outpaced by computing demand, increasing energy costs, and low switching costs for customers will force the model labs to make changes that raise the barrier to entry (via higher pricing, more restrictive rate limiting, etc.) or force their customers into longer-term commitments.

> I think we've seen this play out before with other tech companies where discounting early use ends up entrenching demand and allowing the company to build larger and more efficient infrastructure.

We've also seen failures that were convinced "they would make it up in volume." I guess the bet is that infra will get that much more efficient, but it's not clear how much slack there is.

  • A lot, and even more over the coming 2 years. Utilization rates are under 50% across the board, and cheaper, specialized inference chips are coming out all the time. There's also a truckload of research: TurboQuant, HC (DeepSeek), etc.