zozbot234 11 hours ago SOTA models are reportedly MoE, not dense.
bigyabai 2 hours ago A 5T MoE model is still bottlenecked by streaming weights from SSD, in addition to compute bottlenecks during prefill and decode.
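To put rough numbers on the SSD-streaming point, here is a back-of-envelope sketch. Only the 5T total parameter count comes from the thread; the active-parameter fraction, the quantization width, the SSD bandwidth, and the function name `decode_tokens_per_second` are all hypothetical, chosen purely for illustration. It also assumes the worst case where every active weight that is not cached in RAM has to be read from SSD for each decoded token.

```python
def decode_tokens_per_second(active_params, bytes_per_param,
                             ssd_bytes_per_second, cache_hit_fraction=0.0):
    """Upper bound on decode speed when uncached active weights
    are streamed from SSD once per generated token."""
    if not 0.0 <= cache_hit_fraction <= 1.0:
        raise ValueError("cache_hit_fraction must be between 0 and 1")
    # Bytes that must come off the SSD for a single token.
    bytes_from_ssd = active_params * bytes_per_param * (1.0 - cache_hit_fraction)
    if bytes_from_ssd == 0:
        return float("inf")  # everything is cached, SSD is not the limit
    return ssd_bytes_per_second / bytes_from_ssd


if __name__ == "__main__":
    total_params = 5e12      # 5T, from the comment above
    active_fraction = 0.02   # ASSUMPTION: share of weights active per token
    bytes_per_param = 0.5    # ASSUMPTION: 4-bit quantized weights
    ssd_bandwidth = 7e9      # ASSUMPTION: 7 GB/s sequential read

    rate = decode_tokens_per_second(total_params * active_fraction,
                                    bytes_per_param, ssd_bandwidth)
    print(f"SSD-bound decode ceiling: {rate:.2f} tokens/s")
```

Under these made-up inputs the ceiling is well below one token per second, before counting any compute cost in prefill or decode. A higher cache hit rate or a smaller active fraction raises the bound, so the result is only as good as the assumptions fed in.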