phonon 6 hours ago M3 Ultra has a 1024-bit memory bus (819 GB/s) and starts at $3,999 (96GB of RAM). It can be done....
bigyabai 6 hours ago The tradeoff is that the M3 Ultra's GPU loses to laptop GPUs in compute benchmarks. All of that bandwidth is wasted idling for token prefill. For inference workloads, it makes a lot more sense to optimize for prefill/ttft before maxing out memory bandwidth.
Schiendelman 22 minutes ago With the M6 theoretically coming later this year, Apple seems to be realizing they need to catch up with more lanes of GPU.
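For anyone wanting to check the numbers in this thread, here is a rough back-of-the-envelope sketch. It assumes the 1024-bit bus runs at 6400 MT/s (a rate not stated above, chosen because it reproduces the quoted 819 GB/s), and the 40 GB model size is a purely hypothetical figure for illustration. It only covers the bandwidth-bound decode phase; prefill is compute-bound and is not modeled here, which is the point bigyabai is making.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_sec: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9


def decode_tokens_per_sec_upper_bound(bandwidth_gbs: float, model_size_gb: float) -> float:
    """Crude ceiling for single-stream decode: each generated token
    has to stream all of the weights from memory once."""
    return bandwidth_gbs / model_size_gb


# Assumption: 6400 MT/s memory on the 1024-bit bus mentioned in the thread.
bandwidth = peak_bandwidth_gbs(bus_width_bits=1024, mega_transfers_per_sec=6400)

# Assumption: a hypothetical 40 GB set of weights.
decode_ceiling = decode_tokens_per_sec_upper_bound(bandwidth, model_size_gb=40)

print(f"peak bandwidth: {bandwidth:.1f} GB/s")
print(f"decode ceiling: {decode_ceiling:.2f} tokens/s")
```

This is a ceiling, not a prediction: real decode throughput lands below it, and time to first token depends on GPU compute rather than on this figure.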