Comment by miki123211
2 days ago
> We are already at the limit of how small we can scale chips
I strongly suspect this is not true for LLMs. Once progress stabilizes, doing things like embedding the weights of some model directly as part of the chip will suddenly become economical, and that's going to cut costs down dramatically.
Then there's distillation, which basically makes smaller models get better as bigger models get better. You don't necessarily need to run a big model all of the time to reap its benefits.
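For anyone who hasn't seen distillation before, here is a rough sketch of the usual soft-label loss: the small "student" model is trained to match the big "teacher" model's softened output distribution. This is a minimal illustration under my own assumptions, not any lab's actual training code; the function names, the example logits, and the temperature of 2.0 are all made up for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the common convention for keeping gradient
    magnitudes comparable across temperatures (an assumption here,
    not something stated in the thread).
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(p_teacher, p_student) if p > 0)
    return kl * temperature ** 2

# Hypothetical logits over three classes.
teacher = [4.0, 1.0, 0.5]
student_close = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
student_far = [0.5, 1.0, 4.0]    # disagrees with the teacher

loss_close = distillation_loss(student_close, teacher)
loss_far = distillation_loss(student_far, teacher)
```

The student that agrees with the teacher gets the lower loss, so minimizing this pushes the small model toward the big model's behavior. In that sense a better teacher gives the student a better target.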
> so unless the price of electricity comes down exponentially
This is more likely than you think. AI is extremely bandwidth-efficient and not too latency-sensitive (unlike e.g. Netflix et al), so it's pretty trivial to offload AI work to places where electricity is abundant and power generation is lightly regulated.
> Most companies are already running AI models at a loss, scaling the models to be bigger (like GPT 4.5) only makes them more expensive to run.
"We're profitable on inference. If we didn't pay for training, we'd be a very profitable company." Sam Altman, OpenAI CEO[1].
[1] https://www.axios.com/2025/08/15/sam-altman-gpt5-launch-chat...
> doing things like embedding the weights of some model directly as part of the chip will suddenly become economical, and that's going to cut costs down dramatically.
An implementation of inference for one specific ANN in fixed-function analog hardware could probably also beat a commodity GPU by a couple of orders of magnitude in perf per watt, and pretty easily.
> "We're profitable on inference. If we didn't pay for training, we'd be a very profitable company."
That's OpenAI (though I'd be curious if that statement holds for subscriptions as opposed to API use). What about the downstream companies that use OpenAI models? I'm not sure the picture is as rosy for them.