Comment by codazoda

21 hours ago

Today the Mini tops out at 48GB. Gotta go to the Studio to get 64GB.

5 comments

codazoda

Don't buy the Mini or Studio. Both have the M4, which lacks the Neural Accelerators, making prompt processing ~3-4x slower than on the M5.

mortenjorck 21 hours ago
I assume those don't just work automatically with an off-the-shelf GGUF. What do you need in your local inference stack to take advantage of the M5's neural accelerators?
- wren6991 11 hours ago
  
  Apple muddied the waters by calling them "neural accelerators", but it seems like what they actually added in the M5 generation is tensor instructions for the existing GPU cores. It's not a separate accelerator like the ANE.
  llama.cpp's Metal backend does use them when they're available.
- aurareturn 21 hours ago
  
  They do work with llama.cpp and MLX automatically.
2Gkashmiri 13 hours ago

Apple Mac Studio (M3 Ultra chip / 28-core CPU / 60-core GPU / 96 GB / 1 TB)
How is this config?