← Back to context

Comment by zozbot234

4 hours ago

The underlying advantage of local inference is that you're repurposing your existing hardware for free. You don't need your token spend to pay a share of the capex cost for datacenters that are large enough to draw gigawatts in power, you can just pay for your own energy use. Even though the raw energy cost per operation will probably be higher for local inference, the overall savings in hardware costs can still be quite real.