Comment by applfanboysbgon
5 days ago
Unless I'm misunderstanding, this is counting the entire laptop in the cost of generating tokens. The calculation seems to omit that, in addition to receiving LLM output, you have also received a laptop in exchange for your money. If you intend to put this machine in a dark corner and run it solely as a token-munching server, a laptop would be an exceptionally poor choice of technology for this purpose. But if you intend to use the laptop as a laptop, having a laptop is a pretty big benefit over not having a laptop.
You also get the benefit of privacy, freedom from censorship, and control over the model used (i.e. it will not be rugpulled on you in three months after you've built a workflow around a specific model's idiosyncrasies).
Yeah, a better metric might be the difference in cost between the laptop you need to run local models and the laptop you would have bought anyway.
The base 14" M5 MacBook Pro is $1700 with 16GB/1TB. The author's spec is $4300, which is $2600 more.
It depends on how often you use it (and your tolerance for slow inference) and whether you would have otherwise bought a higher spec. For my needs, this costs a LOT more.
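To make that metric concrete, here's a rough sketch of the arithmetic. The two prices are the ones quoted above; the lifetime token count is purely a made-up placeholder (plug in your own usage), and electricity and resale value are ignored.

```python
def incremental_cost(local_spec_price: float, baseline_price: float) -> float:
    """Premium paid for the local-inference spec over the laptop you'd buy anyway."""
    return local_spec_price - baseline_price


def cost_per_million_tokens(premium: float, lifetime_tokens: float) -> float:
    """Spread the premium over every token generated during the laptop's life."""
    if lifetime_tokens <= 0:
        raise ValueError("lifetime_tokens must be positive")
    return premium / (lifetime_tokens / 1_000_000)


# Prices quoted in the thread: $4300 author's spec vs. $1700 base model.
premium = incremental_cost(4300, 1700)

# ASSUMPTION: hypothetical usage figure, not taken from the thread or the article.
ASSUMED_LIFETIME_TOKENS = 500_000_000

per_million = cost_per_million_tokens(premium, ASSUMED_LIFETIME_TOKENS)
print(f"Premium for local inference: ${premium}")
print(f"Hardware cost per 1M tokens: ${per_million:.2f}")
```

The point is only that the numerator should be the $2600 premium rather than the full $4300; whether the per-token figure ends up attractive depends entirely on how many tokens you actually generate.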
> control over the model used
But you lose access to the most capable models; you can only run the small ones.
Sure. I have unlimited AI credits through my job, but I’m learning by running things locally. I started with OpenRouter, then downloaded open source tools to run locally, then modified some Python, and now I’m creating pipelines in Rust. Yes, the irony is that I’m using Opus to crack these things open.
And they run slower and quantized.
> in addition to receiving LLM output, you have also received a laptop in exchange for your money
And, since it's a Mac, whenever you're ready to upgrade it'll still have a fairly decent resale value.
How is it going to have decent resale when you bought it for $15,000 over MSRP because of a time-limited LLM mass-psychosis?
https://www.ebay.com/sch/i.html?_nkw=apple+mac+studio+m3+ult...
I think you'd need to tinker quickly, realize that anything with CUDA (other than the awful DGX Spark) is better for learning and that prefill is killing your ability to actually run models large enough to justify that RAM, and then cure yourself before the rest of the crowd does.
https://www.apple.com/shop/buy-mac/mac-studio/m3-ultra-chip-...
https://www.apple.com/shop/buy-mac/macbook-pro/16-inch-space...
Well obviously I'm not talking about buying over MSRP.
OpenRouter can’t play Cyberpunk 2077 at max settings in 5K HDR!
OP is giving you the absolute best case compared to most of the people who've been overcome with psychosis hoarding Macs.
An unreasonable number of these people spent $10,000+ for Mac Studios that are still compute bottlenecked and don't have anything more efficient than Gemma 4 to run.