Comment by segmondy

12 days ago

Seriously? Does Netflix or Spotify cost the same everywhere around the world? They earn less and their buying power is less.

The costs of Netflix and Spotify are licensing. Offering the subscription at half price to additional users is non-cannibalizing and a way to get more revenue from the same content.

The cost of LLMs are the infrastructure. Unless someone can buy/power/run compute cheaper (Google w/ TPUs, locales with cheap electricity, etc), there won't be a meaningful difference in costs.

  • That assumes inference efficiency is static, which isn't really the case. Between aggressive quantization, speculative decoding, and better batching strategies, the cost per token can vary wildly on the exact same hardware. I suspect the margins right now come from architecture choices as much as raw power costs.

Sure so do professional tools like Microsoft teams or compute in different places of the world.