Comment by zozbot234
6 hours ago
The underlying advantage of local inference is that you're repurposing your existing hardware for free. You don't need your token spend to pay a share of the capex cost for datacenters that are large enough to draw gigawatts in power, you can just pay for your own energy use. Even though the raw energy cost per operation will probably be higher for local inference, the overall savings in hardware costs can still be quite real.
No comments yet
Contribute on Hacker News ↗