Comment by 7e

21 hours ago

No, not at all. If this were true, Google would be killing it in MLPerf benchmarks, but they are not.

It’s better to have a faster, smaller network for model parallelism and a larger, slower one for data parallelism than a single very large but slower network for everything. This is why NVIDIA wins.