
Comment by 7e

1 day ago

No, not at all. If that were true, Google would be killing it in the MLPerf benchmarks, but it isn't.

It’s better to have a small, fast network for model parallelism and a larger, slower one for data parallelism than one very large but slow network for everything. This is why NVIDIA wins.
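
To make that concrete, here is a minimal JAX sketch of what such a hierarchy looks like in code. It is my own illustration, not anything from an MLPerf submission: the cluster shape (4 nodes of 8 GPUs) and the matrix size are assumptions.

    # A minimal sketch of the hierarchy above, assuming a hypothetical
    # cluster of 4 nodes x 8 GPUs (32 devices total).
    # The "model" axis maps to GPUs inside a node on the fast NVLink-class
    # fabric; the "data" axis spans nodes on the slower scale-out network.
    import numpy as np
    import jax
    from jax.sharding import Mesh, NamedSharding, PartitionSpec

    devices = np.array(jax.devices()).reshape(4, 8)  # (nodes, gpus_per_node)
    mesh = Mesh(devices, axis_names=("data", "model"))

    # Shard the weight matrix only along the fast "model" axis, so
    # tensor-parallel collectives stay inside a node; the data-parallel
    # gradient all-reduce crosses the slower network once per step.
    weights = jax.numpy.zeros((8192, 8192))
    weights = jax.device_put(weights, NamedSharding(mesh, PartitionSpec(None, "model")))

The point is the placement: the bandwidth-hungry model-parallel traffic lives on the small, fast fabric, and only the cheaper data-parallel traffic touches the big, slow one.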