zozbot234 10 hours ago
SOTA models are reportedly MoE, not dense.

bigyabai 37 minutes ago
A 5T MoE model is still bottlenecked by streaming weights from SSD, in addition to compute bottlenecks during prefill and decode.
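To put rough numbers on the SSD-streaming point, here is a back-of-envelope sketch. Everything in it is an assumption for illustration, not something stated in the thread: the function name, the 200B active-parameter count, 4-bit weights, and a 7 GB/s SSD are all hypothetical. It only bounds decode speed by how fast the active expert weights can be read for each token, and ignores compute, KV cache traffic, and batching.

```python
def decode_tokens_per_second(active_params, bytes_per_param,
                             ssd_bytes_per_second, cache_hit_rate=0.0):
    """Upper bound on decode throughput when active weights stream from SSD.

    active_params:        parameters touched per token (assumed, not measured)
    bytes_per_param:      e.g. 0.5 for 4-bit quantization
    ssd_bytes_per_second: sustained sequential read bandwidth
    cache_hit_rate:       fraction of active weights already resident in RAM
    """
    if not 0.0 <= cache_hit_rate < 1.0:
        raise ValueError("cache_hit_rate must be in [0, 1)")
    if active_params <= 0 or bytes_per_param <= 0 or ssd_bytes_per_second <= 0:
        raise ValueError("sizes and bandwidth must be positive")
    # Bytes that must actually come off the SSD for one decoded token.
    bytes_per_token = active_params * bytes_per_param * (1.0 - cache_hit_rate)
    return ssd_bytes_per_second / bytes_per_token


if __name__ == "__main__":
    # Hypothetical figures: 200B active params, 4-bit weights, 7 GB/s SSD.
    cold = decode_tokens_per_second(200e9, 0.5, 7e9)
    warm = decode_tokens_per_second(200e9, 0.5, 7e9, cache_hit_rate=0.5)
    print(f"no cache:  {cold:.3f} tok/s")
    print(f"50% cache: {warm:.3f} tok/s")
```

Under these made-up inputs the bound stays well below one token per second even with half the active weights cached, so MoE sparsity reduces the bytes read per token but does not by itself remove the storage bottleneck.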