Comment by jwr

2 months ago

The author seems unaware of how well recent Apple laptops run LLMs. This is puzzling and puts into question the validity of anything in this article.

96 comments

jwr

gcanyon 2 months ago

If Apple offered a reasonably-priced laptop with more than 24gb of memory (I'm writing this on a maxed-out Air) I'd agree. I've been buying Apple laptops for a long time, and buying the maximum memory every time. I just checked, and I see that now you can get 32gb. But to get 64gb I think you have to spend $3700 for the MBMax, and 128gb starts at $4500, almost 3x the 32gb Air's price.

And as far as I understand it, an Air with an M3 is perfectly capable of running larger models (albeit slower) if it had the memory.

mft_ 2 months ago
You’re not wrong that Apple’s memory prices are unpleasant, but also consider the competition - in this context (running LLMs locally) laptops with large amounts of fast memory that can be purposed for the GPU. This limits you to Apple or one specific AMD processor at present.
An HP Zbook with an AMD 395+ and 128Gb of memory apparently lists for $4049 [0]
An ASUS ROG Flow z13 with the same spec sells for $2799 [1] - so cheaper than Apple, but still a high price for a laptop.
[0] https://hothardware.com/reviews/hp-zbook-ultra-g1a-128gb-rev...
[1] https://www.hidevolution.com/asus-rog-flow-z13-gz302ea-xs99-...
- gcanyon 2 months ago
  
  Yeah, I'm by no means saying that Apple is uniquely bad here -- it's just an issue I've been frustrated by since the first M1 chip, long before local LLMs made it a serious issue. More memory is always a good idea, and too much is never enough.
- subscribed 2 months ago
  
  You can get any low spec laptop that has no soldered DIMMs and just replace them with the maximum supported capacity.
  You don't necessarily need to go the maxed up SKU.
  
  4 replies →
- dehugger 2 months ago
  
  The framework desktop will get you the 395+ and 128gb of ram for 2k USD.
jdprgm 2 months ago

The trick here is buying used. Especially for something like the m1 series there is tremendous value to be had on high memory models where the memory hasn't changed significantly over generations compared the cpus and even m1's are quite competent for many workloads. Got a m1 max 64gb ram recently for I think $1400.
jwr 2 months ago

I think pricing is just one dimension of this discussion — but let's dive into it. I agree it's a lot of money. But what are you comparing this pricing to?
From what I understand, getting a non-Apple solution to the problem of running LLMs in 64GB of VRAM or more has a price tag that is at least double of what you mentioned, and likely has another digit in front if you want to get to 128GB?
fnord77 2 months ago
it's astonishing how apple gouges on the memory and ssd upgrade prices (I'm on an M1 w/ 64Gb/4Tb).
That said they have some elasticity when it comes to the DRAM shortage.
- Eggpants 2 months ago
  
  The M-series unified memory is built into the chip itself, not separate components. Of course Apple is going to maintain their margins, but it’s easy to see why with this design more memory is more expensive than drams. Well maybe not with the current market pricing which hopefully is temporary.
- aurareturn 2 months ago
  
  They gouge you on RAM and SSD but provide a far better overall machine for the price than Windows laptops.

fancyfredbot 2 months ago

I think the author is aware of Apple silicon. The article mentions the fact Apple has unified memory and that this is advantageous for running LLMs.

dangus 2 months ago
Then idk why they say that most laptops are bad at running LLMs, Apple has a huge marketshare in the laptop market and even their cheapest laptops are capable in that realm. And their PC competitors are more likely to be generously specced out in terms of included memory.
> However, for the average laptop that’s over a year old, the number of useful AI models you can run locally on your PC is close to zero.
This straight up isn’t true.
- literalAardvark 2 months ago
  
  Apple has a 10-18% market share for laptops. That's significant but it certainly isn't "most".
  Most laptops can run at best a 7-14b model, even if you buy one with a high spec graphics chip. These are not useful models unless you're writing spam.
  Most desktops have a decent amount of system memory but that can't be used for running LLMs at a useful speed, especially since the stuff you could run in 32-64GB RAM would need lots of interaction and hand holding.
  And that's for the easy part, inference. Training is much more expensive.
  
  5 replies →
- andai 2 months ago
  
  So I'm hearing a lot of people running LLMs on Apple hardware. But is there actually anything useful you can run? Does it run at a usable speed? And is it worth the cost? Because the last time I checked the answer to all three questions appeared to be no.
  Though maybe it depends on what you're doing? (Although if you're doing something simple like embeddings, then you don't need the Apple hardware in the first place.)
  
  8 replies →
- fancyfredbot 2 months ago
  
  Most laptops have 16GB of RAM or less. A little more than a year ago I think the base model Mac laptop had 8GB of RAM which really isn't fantastic for running LLMs.
- layer8 2 months ago
  
  By “PC”, they mean non-Apple devices.
  Also, macOS only has around 10% desktop market share globally.
  
  3 replies →
- DANmode 2 months ago
  
  > Apple has a huge marketshare in the laptop market
  Hello, from outside of California!
  
  4 replies →

whazor 2 months ago

But economically, it is still much better to buy a lower spec't laptop and to pay a monthly subscription for AI.

However, I agree with the article that people will run big LLMs on their laptop N years down the line. Especially if hardware outgrows best-in-class LLM model requirements. If a phone could run a 512GB LLM model fast, you would want it.

m4rtink 2 months ago
Are you sure the subscription will still be affordable after the venture capital flood ends and the dumping stops?
- nl 2 months ago
  
  100% yes.
  The amount of compute in the world is doubling over 2 years because of the ongoing investment in AI (!!)
  In some scenario where new investment stops flowing and some AI companies go bankrupt all that compute will be looking for a market.
  Inference providers are already profitable so with cheaper hardware it will mean even cheaper AI systems.
  
  10 replies →
- TeMPOraL 2 months ago
  
  Doesn't matter now. GP can revisit the math and buy some hardware once the subscription prices actually grow too high.
- solatic 2 months ago
  
  You have to remember that companies are kind of fungible in the sense that founders can close old companies and start new ones to walk away from bankruptcies in the old companies. When there's a bust and a lot of companies close up shop, because data centers were overbuilt, there's going to be a lot of GPUs being sold at firesale prices - imagine chips sold at $300k today being sold for $3k tomorrow to recoup a penny on the dollar. There's going to be a business model for someone buying those chips at $3k, then offering subscription prices at little more than the cost of electricity to keep the dumped GPUs running somewhere.
  
  1 reply →
- anonzzzies 2 months ago
  
  They will go down. Or the company will be gone.
seanmcdirmid 2 months ago
Running an LLM locally means you never have to worry about how many tokens you've used, and also it allows for a lot of low latency interactions on smaller models that can run quickly.
I don't see why consumer hardware won't evolve to run more LLMs locally. It is a nice goal to strive for, which consumer hardware makers have been missing for a decade now. It is definitely achievable, especially if you just care about inference.
- KellyCriterion 2 months ago
  
  isnt this what all these NPUs are created for?
  
  1 reply →
ignoramous 2 months ago
> economically, it is still much better to buy a lower spec't laptop and to pay a monthly subscription for AI
Uber is economical, too; but folks prefer to own cars, sometimes multiple.
And how there's market for all kinds of vanity cars, fast sportscars, expensive supercars... I imagine PCs & Laptops will have such a market, too: In probably less than a decade, may be a £20k laptop running a 671b+ LLM locally will be the norm among pros.
- subjectsigma 2 months ago
  
  > Uber is economical, too
  One time I took an Uber to work because my car broke down and was in the shop and the Uber driver (somewhat pointedly) made a comment that I must be really rich to commute to work via Uber because Ubers are so expensive
  
  1 reply →
- joshred 2 months ago
  
  Paying $30-$70/day to commute is economical?
  
  8 replies →
NooneAtAll3 2 months ago

any "it's cheaper to rent than to own" arguments can be (and must be) completely disregarded due to experience of the last decade
so stop it

azuanrb 2 months ago

You still need ridiculously high spec hardware, and at Apple’s prices, that isn’t cheap. Even if you can afford it (most won't), the local models you can run are still limited and they still underperform. It’s much cheaper to pay for a cloud solution and get significantly better result. In my opinion, the article is right. We need a better way to run LLMs locally.

onion2k 2 months ago
You still need ridiculously high spec hardware, and at Apple’s prices, that isn’t cheap.
You can easily run models like Mistral and Stable Diffusion in Ollama and Draw Things, and you can run newer models like Devstral (the MLX version) and Z Image Turbo with a little effort using LM Studio and Comfyui. It isn't as fast as using a good nVidia GPU or a cloud GPU but it's certainly good enough to play around with and learn more about it. I've written a bunch of apps that give me a browser UI talking to an API that's provided by an app running a model locally and it works perfectly well. I did that on an 8GB M1 for 18 months and then upgraded to a 24GB M4 Pro recently. I still have the M1 on my network for doing AI things in the background.
- liuliu 2 months ago
  
  You can run newer models like Z Image Turbo or FLUX.2 [dev] using Draw Things with no effort too.
jki275 2 months ago
I bought my M1 Max w/ 64gb of ram used. It's not that expensive.
Yes, the models it can run do not perform like chatgpt or claude 4.5, but they're still very useful.
- mirror_neuron 2 months ago
  
  I’m curious to hear more about how you get useful performance out of your local setup. How would you characterize the difference in “intelligence” of local models on your hardware vs. something like chatgpt? I imagine speed is also a factor. Curious to hear about your experiences in as much detail as you’re willing to share!
  
  1 reply →
almosthere 2 months ago
749 for an M4 air at Amazon right now
- tossandthrow 2 months ago
  
  Try running anything interesting on these 8gb of ram.
  You need 96gb or 128gb to do non trivial things. That is not yet 749 usd
  
  7 replies →
whitehexagon 2 months ago

I was pleasantly surprised at the speed and power of my second hand M1 Pro 32GB running Asahi & Qwen3:32B. It does all I need, and I dont mind the reading pace output, although I'd be tempted by M2 Ultra if the secondhand market hadn't also exploded with the recent RAM market manipulations.
Anyway, I'm on a mission to have no subscriptions in the New Year. Plus it feels wrong to be contributing towards my own irrelevance (GAI).

dangus 2 months ago

Yeah, any Mac system specced with a decent amount of RAM since the M1 will run LLMs locally very well. And that’s exactly how the built-in Apple Intelligence service works: when enabled, it downloads a smallish local model. Since all Macs since the M1 have very fast memory available to the integrated GPU, they’re very good at AI.

The article kinda sucks at explaining how NPUs aren’t really even needed, they just have potential to make things more efficient in the future rather than depending on the power consumption involved with running your GPU.

terafo 2 months ago

This article specifically talks about PC laptops and discusses changes in them.

cmxch 2 months ago

Only if you want to take all the proprietary baggage and telemetry that comes with Apple platforms by default.

A Lenovo T15g with a 16gb 3080 mobile doesn’t do too badly and will run more than just Windows.

pimeys 2 months ago

I just got a Framework desktop with 128 GB of shared RAM just before the memory prices rocketed, and I can comfortably run many even bigger oss models locally. You can dedicate 112GB to the GPU and it runs Linux perfectly.

selinkocalar 2 months ago

The M-series chips really changed the game here

reactordev 2 months ago

This article is to sell more laptops.