Comment by simonw
18 hours ago
I think they mean that the DeepSeek API charges are less than it would cost for the electricity to run a local model.
Local model enthusiasts often assume that running locally is more energy efficient than running in a data center, but fail to take economies of scale into account.
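A rough back-of-envelope makes the comparison concrete; every number below is an illustrative assumption, not a quoted DeepSeek price or a measured GPU figure:

```python
# Sketch: electricity cost of batch-size-1 local inference vs. an API price.
# Every figure here is an assumption for illustration only.

api_price_per_mtok = 0.30    # assumed USD per million tokens via an API
gpu_power_w = 350            # assumed consumer GPU draw under inference load
tokens_per_sec = 30          # assumed single-stream decode speed
electricity_usd_kwh = 0.30   # assumed residential electricity rate

hours_per_mtok = 1_000_000 / tokens_per_sec / 3600
local_cost = gpu_power_w / 1000 * hours_per_mtok * electricity_usd_kwh

print(f"local electricity: ${local_cost:.2f} / Mtok")        # ~ $0.97
print(f"API (assumed):     ${api_price_per_mtok:.2f} / Mtok")
```

Under those assumptions, the electricity alone costs roughly 3x the API price, before counting hardware at all.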
> Local model enthusiasts often assume that running locally is more energy efficient than running in a data center,
It's a well-known truism on /r/Localllama that local is rarely cheaper, unless you run batched, in which case it is indeed massively (around 10x) cheaper.
> I think they mean that the DeepSeek API charges are less than it would cost for the electricity to run a local model.
Because it is hosted in China, where energy is cheap. In the ex-USSR, where I live, electricity is inexpensive too. And since I had to run a small space heater all winter anyway, due to the inadequacy of my central heating, the machine's waste heat replaced the heater and running local came out 100% free.
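For what it's worth, the heating math checks out: essentially all of a GPU's electrical draw ends up as room heat, so during heating season each watt of inference displaces a watt of resistive heating. A trivial sketch with an assumed 350 W load:

```python
# During heating season a GPU is a space heater that happens to compute:
# nearly 100% of its electrical draw becomes room heat, watt for watt.
gpu_draw_w = 350           # assumed inference load
heater_displaced_w = 350   # resistive heating the waste heat replaces
print(gpu_draw_w - heater_displaced_w)  # -> 0 W of marginal draw
```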
Some of those local model enthusiasts can actually afford solar panels.
You are still incurring a cost if you use the electricity yourself instead of selling it back to the grid.
The extent of that heavily depends on where you are. Where I live in NZ, the grid export rates are very low while the import rates are very high.
Our peak import rate is 3x higher than our solar export rate. In other words, we'd need to sell 3 kWh of energy to offset the cost of using 1 kWh at peak.
We’re currently in the process of accepting a quote for home batteries. The rates here highly incentivise maximising self-use.
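The arithmetic behind that incentive, with placeholder tariffs standing in for the actual rates:

```python
# Self-use vs. export under a 3:1 import/export spread.
# Tariff numbers are illustrative placeholders, not an actual NZ plan.
peak_import_nzd_kwh = 0.45    # assumed peak import rate
solar_export_nzd_kwh = 0.15   # assumed solar buyback rate

# kWh you must export to pay for 1 kWh imported at peak:
print(peak_import_nzd_kwh / solar_export_nzd_kwh)    # -> 3.0

# what each kWh a battery shifts from export to peak self-use saves:
print(peak_import_nzd_kwh - solar_export_nzd_kwh)    # -> 0.30 NZD
```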
Selling it back to the grid is still possible, but it is a much less financially sound proposition than it was a few years ago because of regulatory capture by the utilities. In some places it is so bad that you get penalized for exporting excess power. Local consumption is the fastest way to capitalize on this, more so if you can make money with that excess power.
Luxembourg: purchase price = 2x the sale price, mostly due to grid costs.
And this is with no income tax or VAT on sold electricity.
Local enthusiasts don't have to fear having their accounts banned.
I guess it mostly comes from running the model at batch size 1 locally vs. a high batch size in a data center, since GPU power consumption doesn't grow that much with batch size.
Note that while a local chatbot user will mostly be running at batch size 1, that's not going to be true if they're running an agentic framework, so the gap is going to narrow or even reverse.
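A sketch of that effect, reusing the assumed numbers from above: aggregate throughput scales with batch size while power draw barely moves, so cost per token collapses.

```python
# Cost per million tokens at batch size 1 vs. 32, assuming power draw
# rises only slightly while aggregate throughput scales with the batch.
# All numbers are illustrative assumptions.

electricity_usd_kwh = 0.30   # assumed rate, as above

def usd_per_mtok(total_tokens_per_sec, power_w):
    hours = 1_000_000 / total_tokens_per_sec / 3600
    return power_w / 1000 * hours * electricity_usd_kwh

print(usd_per_mtok(1 * 30, 350))    # batch 1:  ~$0.97 / Mtok
print(usd_per_mtok(32 * 20, 400))   # batch 32: ~$0.05 / Mtok, ~18x cheaper
```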
Well, different parts of the world also have different electricity prices.
Usually not a multiple-orders-of-magnitude difference, though.
Is it economies of scale, or is it unpaid externalities?