Comment by jsnell

6 days ago

Inference is actually quite cheap. Like, a highly competitive LLM can cost 1/25th of a search query. And it is not due to inference being subsidized by VC money.

It's also getting cheaper all the time. Something like 1000x cheaper in the last two years at the same quality level, and there's not yet any sign of a plateau.

So it'd be quite surprising if the only long-term business model turned out to be subscriptions.

Can you link to any sources that support your claim?

  • Sure. Here's something I'd written on the subject that I'd left lying in my drafts folder for a month, but I've now published just for you :)

    https://www.snellman.net/blog/archive/2025-06-02-llms-are-ch...

    It has links to public sources on the pricing of both LLMs and search, and explains why the low inference prices can't be due to inference being subsidized. (And while there are other possible explanations, it includes a calculator for what the compound impact of all of those possible explanations could be.)

    • Thanks for sharing!

      It's worthwhile to note that https://github.com/deepseek-ai/open-infra-index/blob/main/20... shows cost vs. theoretical income. They don't show 80% gross margins and there's probably a reason they don't share their actual gross margin.

      OpenAI is the easiest counterexample that proves inference is subsidized right now. They've taken $50B in investment; surpassed 400M WAUs (https://www.reuters.com/technology/artificial-intelligence/o...); lost $5B on $4B in revenue for 2024 (https://finance.yahoo.com/news/openai-thinks-revenue-more-tr...); and project they won't be cash-flow positive until 2029.

      Prices would be significantly higher if OpenAI was priced for unit profitability right now.

      As for the mega-conglomerates (Google, Meta, Microsoft), GenAI is a loss leader to build platform power. GenAI doesn't need to be unit profitable; it just needs to attract and retain people on their platform, e.g. you need a Google Cloud account to use the Gemini API.

    • Just had a quick glance, but I think I found something to add to the Objection!-section of your post:

      Brave's Search API is 3$ CPM and includes Web search, Images, Videos, News, Goggles[0]. Anthropic's API is 10$ CPM for Web search (and text only?), excluding any input/output tokens from your model of choice[1]. Those tokens would add another 15$ CPM, assuming 1KTok per request and Claude Sonnet 4 as a good model, so ~25$ CPM in total.

      So your default "Ratio (Search cost / LLM cost): 25.0x" seems to be more on the 0.12x side of things (Search cost / LLM cost). Mind you, I just flew over everything in 10 mins and have no experience using either API.

      [0]: https://brave.com/search/api/

      [1]: https://www.anthropic.com/pricing#anthropic-api
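
For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch. The CPM figures are the ones quoted there. The per-million-token price is an assumption on my part: 15$ CPM at 1KTok per request works out to 15$ per million tokens, which I'm reading as the Claude Sonnet 4 output price, with all 1K tokens billed at that rate. The function and constant names are made up for illustration.

```python
# Figures quoted in the comment; CPM = USD per 1,000 requests.
BRAVE_SEARCH_CPM = 3.0            # Brave Search API
ANTHROPIC_WEB_SEARCH_CPM = 10.0   # Anthropic web search, excluding tokens
PRICE_PER_MTOK = 15.0             # ASSUMPTION: USD per million tokens (Sonnet 4 output rate)
TOKENS_PER_REQUEST = 1_000        # from the comment: 1KTok per request


def token_cpm(price_per_mtok: float, tokens_per_request: int) -> float:
    """Token cost in USD for 1,000 requests at a per-million-token price."""
    tokens_per_thousand_requests = tokens_per_request * 1_000
    return price_per_mtok * tokens_per_thousand_requests / 1_000_000


def search_to_llm_ratio(search_cpm: float, llm_cpm: float) -> float:
    """Ratio (Search cost / LLM cost), same orientation as the blog's calculator."""
    return search_cpm / llm_cpm


llm_cpm = ANTHROPIC_WEB_SEARCH_CPM + token_cpm(PRICE_PER_MTOK, TOKENS_PER_REQUEST)
ratio = search_to_llm_ratio(BRAVE_SEARCH_CPM, llm_cpm)

print(f"LLM cost with web search: {llm_cpm:.2f} USD CPM")
print(f"Ratio (Search cost / LLM cost): {ratio:.2f}x")
```

Changing `TOKENS_PER_REQUEST` or dropping the web search surcharge moves the ratio a lot, so the result depends heavily on which products are being compared.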