Comment by dist-epoch
1 hour ago
128 GB at 600 GB/s for this versus 32 GB at 1800 GB/s for the 5090.
This is much better value than a 5090: you can run much bigger models.
Here's a pretty detailed breakdown of this [1]:
> tl;dr - For software development, Qwen3.6 27B, 5090 gives you ~3x speed over M5 Max, letting you plow through code, while M5 Max gives you ~4x memory, letting you use higher quantization and bigger context. Which would you choose and why?
I've read a number of things, and the consensus seems to be that yes, you can run a larger model and/or have more context on a 128GB+ Mac, but the performance gap is still massive, and on current hardware we're still at inference rates where that gap matters. By this I mean there's a big difference between 10 tok/s and 30 tok/s. Once we get to a point where it's 100 vs 300, it won't be as big of a deal, a bit like FPS in games.
Oh and there are similar concerns with the DGX Spark [2].
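For a rough feel of the tradeoff, here's a back-of-envelope sketch. It assumes decode is purely memory-bandwidth bound (every generated token reads all the weights once) and ignores KV cache, compute, and runtime overhead, so the numbers are upper bounds and not benchmarks. The function names and the 4-bit/16-bit choices are mine, not from the linked threads; the bandwidth and memory figures are the ones quoted above.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (weights only, no KV cache)."""
    return params_billion * bits_per_weight / 8


def max_decode_tok_s(bandwidth_gb_s: float, params_billion: float,
                     bits_per_weight: float) -> float:
    """Upper bound on decode speed, assuming each token reads all weights once."""
    return bandwidth_gb_s / weights_gb(params_billion, bits_per_weight)


def fits(memory_gb: float, params_billion: float, bits_per_weight: float) -> bool:
    """Weights-only fit check; real usage also needs room for context."""
    return weights_gb(params_billion, bits_per_weight) <= memory_gb


# Figures quoted in the thread: (memory in GB, bandwidth in GB/s).
machines = {"128 GB @ 600 GB/s": (128, 600), "5090": (32, 1800)}

for name, (mem, bw) in machines.items():
    for bits in (4, 16):  # assumed quantization levels, for illustration only
        ok = fits(mem, 27, bits)
        rate = max_decode_tok_s(bw, 27, bits)
        print(f"{name}, 27B @ {bits}-bit: fits={ok}, <= {rate:.0f} tok/s")
```

Under these assumptions the speed ratio is just the bandwidth ratio (3x), which lines up with the ~3x in the tl;dr, while a 27B model at 16-bit (about 54 GB of weights) only fits on the 128 GB machine.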
[1]: https://www.reddit.com/r/LocalLLaMA/comments/1t5v2gr/need_ad...
[2]: https://www.reddit.com/r/LocalLLaMA/comments/1sqk333/dgx_spa...