
Comment by scubbo

6 months ago

> LLMs are cheap enough to run profitably on ads alone

> It is even cheaper to serve an LLM answer than call a web search API

These, uhhhh, these are some rather extraordinary claims. Got some extraordinary evidence to go along with them?

I've operated a top ~20 LLM service for over 2 years, very comfortably profitable with ads. As for the raw costs: you can measure the cost of getting an LLM answer from, say, OpenAI against the cost of the equivalent search query from Bing/Google/Exa, and the search query will cost over 10x more... (rough numbers sketched below)
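
For a rough sense of that comparison, here's a back-of-envelope sketch in Python. Every price and token count below is an illustrative assumption, not a quoted rate; check current provider pricing before trusting the exact ratio:

```python
# Back-of-envelope: cost of one LLM answer (priced per token) vs. one
# web search API call (priced per query). All figures are assumptions.

# Assumed per-token pricing for a cheap model tier (USD per 1M tokens).
PRICE_PER_M_INPUT = 0.15
PRICE_PER_M_OUTPUT = 0.60

# Assumed size of a typical Q&A exchange.
INPUT_TOKENS = 500    # prompt + context
OUTPUT_TOKENS = 300   # generated answer

llm_cost = (INPUT_TOKENS * PRICE_PER_M_INPUT +
            OUTPUT_TOKENS * PRICE_PER_M_OUTPUT) / 1_000_000

# Assumed search API pricing; paid search APIs are commonly quoted
# per 1,000 queries.
search_cost = 5.00 / 1_000  # USD per query (assumption)

print(f"LLM answer:   ${llm_cost:.6f}")                      # ~$0.000255
print(f"Search query: ${search_cost:.6f}")                   # $0.005000
print(f"Search is ~{search_cost / llm_cost:.0f}x the cost")  # ~20x
```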

  • So you don't have any real info on the costs. The question is what OpenAI's profit margin is here, not yours. The theory is that these costs are subsidized by a flow of money from VCs and big tech as they race.

    How cheap is inference, really? What about 'thinking' inference? What are the prices going to be once growth starts to slow and investors start demanding returns on their billions?

    • Every indication we have is that pay-per-token APIs are not subsidized, nor even merely break-even; they run at very high margins. The market dynamics are such that subsidizing those APIs wouldn't make much sense.

      The unprofitability of the frontier labs is mostly due to them not monetizing the majority of their consumer traffic at all.

    • It would be profitable even if we self-hosted the LLMs, which we've done. The only subsidized part is the training costs. So maybe people will one day stop training AI models. (Rough self-hosting arithmetic is sketched after this thread.)
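
A minimal sketch of that self-hosting arithmetic, assuming a rented datacenter GPU and sustained batched throughput. Both figures are illustrative assumptions, not measurements; real throughput varies enormously with model size, batching, and hardware:

```python
# Rough self-hosted cost per token: hourly GPU rental divided by
# sustained throughput. All figures are illustrative assumptions.

GPU_COST_PER_HOUR = 2.00   # USD/hour, rented datacenter GPU (assumption)
TOKENS_PER_SECOND = 1000   # sustained, batched throughput (assumption)

tokens_per_hour = TOKENS_PER_SECOND * 3600
cost_per_m_tokens = GPU_COST_PER_HOUR / tokens_per_hour * 1_000_000
print(f"Self-hosted: ~${cost_per_m_tokens:.2f} per 1M tokens")  # ~$0.56

# Against an assumed API list price, that would imply a large gross
# margin on inference (training costs excluded, as the comment notes).
API_PRICE_PER_M = 3.00     # USD per 1M tokens (assumption)
margin = 1 - cost_per_m_tokens / API_PRICE_PER_M
print(f"Implied gross margin: {margin:.0%}")  # ~81%
```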

https://www.snellman.net/blog/archive/2025-06-02-llms-are-ch..., also note the "objections" section

Anecdotally, thanks to hardware advancements (Moore's law at work), the locally-run AI software I develop has gotten more than 100x faster in the past year.

  • What hardware advancement? There's hardly any these days... Especially not for this kind of computing.

    • Specifically, I upgraded my Mac and ported my software, which ran on Windows/Linux, to macOS and Metal. Literally >100x faster in benchmarks, and overall user workflows became fast enough that I had to "spend" the performance elsewhere, or else the responses were so fast they felt kind of creepy. I have a bunch of _very_ happy users running the software 24/7 on Mac Minis now.
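
For what it's worth, a speedup factor like that is usually computed as a ratio of best-of-N wall-clock timings of the same workload on each backend. A generic timing sketch; the workload and backend functions here are hypothetical placeholders, not anything from the comment:

```python
import time

def bench(fn, *args, repeats: int = 5) -> float:
    """Best-of-N wall-clock seconds for fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical usage: time the same inference workload on both
# backends and report the ratio.
# old = bench(run_inference_cpu, workload)    # hypothetical function
# new = bench(run_inference_metal, workload)  # hypothetical function
# print(f"speedup: {old / new:.0f}x")
```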
