Comment by datadrivenangel

5 days ago

Rounding everything down in the most optimistic setting got me to $0.40 per million tokens, and openrouter has the same model at $.38/mtok.
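For anyone who wants to redo this kind of estimate with their own numbers, here is a rough sketch of the arithmetic. The inputs (100 W draw, 20 tokens/s, $0.15/kWh, a $2,000 machine used flat out for 5 years) are hypothetical placeholders, not the figures behind the $0.40 estimate, and the function names are made up for illustration.

```python
def electricity_cost_per_mtok(watts: float, tokens_per_sec: float, usd_per_kwh: float) -> float:
    """Electricity cost in USD to generate one million tokens."""
    seconds = 1_000_000 / tokens_per_sec   # time needed for 1M tokens
    kwh = watts * seconds / 3_600_000      # watt-seconds -> kilowatt-hours
    return kwh * usd_per_kwh


def hardware_cost_per_mtok(price_usd: float, lifetime_years: float,
                           utilization: float, tokens_per_sec: float) -> float:
    """Purchase price spread over every token generated during the useful life."""
    active_seconds = lifetime_years * 365 * 24 * 3600 * utilization
    total_mtok = active_seconds * tokens_per_sec / 1_000_000
    return price_usd / total_mtok


# Hypothetical inputs, purely for illustration.
power = electricity_cost_per_mtok(watts=100, tokens_per_sec=20, usd_per_kwh=0.15)
hardware = hardware_cost_per_mtok(price_usd=2000, lifetime_years=5,
                                  utilization=1.0, tokens_per_sec=20)
print(f"electricity: ${power:.2f}/mtok, hardware: ${hardware:.2f}/mtok")
```

Lowering `utilization` to something realistic for a home machine raises the hardware share quickly, since the purchase price is spread over fewer tokens.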

But once all that is done you still own a Mac in one case, and you don’t in the other, correct?

  • Even at just the electricity cost openrouter will be both

    1) Roughly break-even to a little bit cheaper in per-token cost
    2) Much, much faster

    So the cost of the mac barely even matters, it's just an extra cost beyond.

    Sure, data center providers can pay lower rates.

    The point of this article is that LLMs at home really don't make a ton of sense, unless you are willing to pay through the nose for privacy. There is absolutely no cost saving to be had.

    If you're looking at your own datacenter as a larger corporate client, that could change.

    There are also some providers that will contractually keep your data private, like AWS Bedrock or parts of Google/Azure (I don't know their stack names).

    AWS even has AWS Secret Region and AWS Top Secret Region if you want to use LLMs on classified data.

    You have to value privacy at a roughly absurd level to not want to use LLMs run efficiently at scale by someone else. For the home user, just the extra efficiency produced by batching requests from a large number of users in a datacenter is a real win.

    Some of these companies are even selling tokens below cost to get marketshare. If someone will sell you a service for a dollar bill or three quarters, why wouldn't you take the three quarters?

    • > If someone will sell you a service for a dollar bill or three quarters, why wouldn't you take the three quarters?

      Because one day they'll send you an email informing you the new rate is $1.50, and if you missed the email, that's not their problem.

  • Not always. The calculations take its useful life expectancy as an input. If they estimate it correctly, there is a high likelihood of it breaking, burning out, or being woefully out of date by the end. At the 10-year window you are looking at losing support for security updates.

    So if you are lucky you might end up with something that still runs but most folks won't find it particularly useful

  • Yea this; it’s the same reason why mortgaging is cheaper than renting

I’ll keep my data local over a $.02/mtok difference.

  • It’s more than just data locality. OpenRouter is faster, no? I have an M4 Pro, and anything but the smallest, dumbest models is unusably slow for interactive use. I personally haven’t yet found a good use case for offline/non-interactive LLM work locally.

    • Yeah. The speed is the biggest issue. The intelligence of open models is good enough for serious work (though still worse than the frontier models), but the cloud models are often 3-7 times faster, and you can get more parallelization and so get speeds on the order of hundreds of tokens per second, which makes things fast!

    • And continuing the argument of "more than just...", if you stopped inferencing on your Mac you still have a generally nice computer. The difference between rent vs buy.

    • I played with classifying and summarizing my entire email history (per email) with small models, but that only took about 12h of GPU time at most. Using a coding agent cli wrapper in that case is far slower because of all the spin up cost and the system prompt they inject even if you want to turn it all off.

      If I used an actual direct API it probably would've been much faster, but I'm doing it for hobby / fun reasons. You also get to fiddle with a lot more params.

    • I’m running a local Whisper + Gemma 4 pipeline with a cheap USB mic to extract health-related data and potential todos from ambient speech. It doesn’t have to be fast or 100% correct: if it captures at least a few bits of interesting information that would otherwise go unnoticed, it’s still a win.

What is it with AI SaaS naming themselves "openxyz" when there is 0% open about them?

  • They learnt from OpenAI that naming yourself open-xyz doesn't actually require opening anything.

  • It's the next co-opted buzzword after "democratize".

    • Yeah, anytime that I see "People's" or "Democratic" in the name of a nation, I grow suspicious. It is rarely a well-functioning democracy.

  • It's how marketing works. If something is a problem they have to loudly claim to have fixed it. Look around the economy and you'll see lots of it: "healthy" (high-sugar) muesli bars, clean diesel, surveillance wrapped up as keeping us safe. The modus operandi of marketing is to change minds about self-evident things; otherwise, what is the point?

Also, many people have even cheaper power, or even free surplus power from solar that would otherwise go unused.

I don't do local inference other than for hobby & learning reasons because electricity is so expensive where I am.
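To see how much the local electricity rate matters, one can solve for the rate at which electricity alone matches a hosted per-token price. This is only a sketch: the $0.38/mtok figure is the OpenRouter price quoted at the top of the thread, while the 100 W draw and 20 tokens/s throughput are hypothetical placeholders, and the function name is invented for illustration.

```python
def break_even_usd_per_kwh(api_usd_per_mtok: float, watts: float, tokens_per_sec: float) -> float:
    """Electricity rate at which local power cost per million tokens equals the hosted price."""
    # Energy needed locally to generate one million tokens, in kWh.
    kwh_per_mtok = watts * (1_000_000 / tokens_per_sec) / 3_600_000
    return api_usd_per_mtok / kwh_per_mtok


# Hypothetical machine: 100 W under load, 20 tokens/s.
rate = break_even_usd_per_kwh(api_usd_per_mtok=0.38, watts=100, tokens_per_sec=20)
print(f"break-even electricity rate: ${rate:.3f}/kWh")
```

Above that rate the hosted option wins on electricity alone under these assumptions; with free surplus solar the electricity term drops to zero and only the hardware cost remains.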