Comment by temp0826

15 hours ago

Probably a good thing SLI fell out of fashion. There are no consumer boards with multiple x16 slots, only a few with two x8 slots (gated behind a "mode" switch). A few years ago it looked like we were on our way to four full x16 slots. For CUDA/LLM/whatever, does it really matter if the cards are in x1 slots?

It's the other way around. SLI falling out of fashion is why there are no consumer boards with multiple x16 slots. There's no longer any demand for it on the consumer side, so the CPU vendors only provide lots of PCIe lanes for expensive chips.

On the server side, seven x16 slot motherboards exist.

I would expect x8 at 5.0 speeds to be plenty for SLI. That's twice as fast as x16 slots were around the end of the SLI era.
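For anyone who wants to check the "twice as fast" arithmetic, here is a rough sketch. It assumes the end of the SLI era means PCIe 3.0, and it only counts line-encoding overhead (not packet or protocol overhead), so the numbers are theoretical per-direction ceilings. The function name `pcie_gb_per_s` is just something made up for this example.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per PCIe generation.
# Gen 1-2 use 8b/10b encoding, Gen 3-5 use 128b/130b.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def pcie_gb_per_s(generation: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s for a PCIe link.

    Ignores packet headers and flow-control overhead, so real-world
    throughput will be somewhat lower.
    """
    gigatransfers, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 for bytes.
    return gigatransfers * efficiency / 8 * lanes


if __name__ == "__main__":
    gen3_x16 = pcie_gb_per_s(3, 16)
    gen5_x8 = pcie_gb_per_s(5, 8)
    print(f"PCIe 3.0 x16: {gen3_x16:.2f} GB/s per direction")
    print(f"PCIe 5.0 x8:  {gen5_x8:.2f} GB/s per direction")
    print(f"ratio: {gen5_x8 / gen3_x16:.1f}x")
```

Each generation doubles the per-lane rate, so going up two generations while halving the lane count nets a 2x gain.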

Having GPUs in x16 slots is still important for LLM stuff, especially multi-GPU, where lots of data needs to move between cards during computation.

  • An x16 PCIe 6.0 setup has more bandwidth than any dual-channel DDR5 memory kit.

  • Depends on what you're doing. I'm pretty sure the bandwidth for inference isn't much.

    • Depends on whether it's tensor parallel (TP) or pipeline parallel (PP). PP doesn't pass much between cards; TP does.
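To put rough numbers on the TP vs PP point above, here is a back-of-the-envelope sketch of inter-GPU traffic per generated token. It assumes a Megatron-style tensor-parallel layout (two all-reduces per transformer layer in the forward pass), a ring all-reduce, fp16 activations, and batch size 1. The model dimensions are hypothetical, picked to resemble a 70B-class model, and the function names are invented for this example.

```python
def pp_bytes_per_token(hidden_size: int, num_gpus: int, bytes_per_elem: int = 2) -> int:
    """Pipeline parallel: one hidden-state vector crosses each stage boundary.

    With num_gpus stages there are (num_gpus - 1) boundaries per token.
    """
    return (num_gpus - 1) * hidden_size * bytes_per_elem


def tp_bytes_per_token(
    hidden_size: int,
    num_layers: int,
    num_gpus: int,
    bytes_per_elem: int = 2,
    allreduces_per_layer: int = 2,  # assumption: one after attention, one after the MLP
) -> float:
    """Tensor parallel: bytes each GPU sends per token across all layers.

    A ring all-reduce over a buffer of size S makes each GPU send
    2 * (N - 1) / N * S bytes.
    """
    buffer_bytes = hidden_size * bytes_per_elem
    ring_factor = 2 * (num_gpus - 1) / num_gpus
    return num_layers * allreduces_per_layer * ring_factor * buffer_bytes


if __name__ == "__main__":
    # Hypothetical 70B-class dimensions, split across two cards.
    hidden, layers, gpus = 8192, 80, 2
    pp = pp_bytes_per_token(hidden, gpus)
    tp = tp_bytes_per_token(hidden, layers, gpus)
    print(f"PP: {pp / 1024:.0f} KiB per token")
    print(f"TP: {tp / 1024**2:.1f} MiB per token per GPU")
    print(f"TP moves about {tp / pp:.0f}x more data")
```

This only counts bytes. With TP the all-reduces also happen many times per token, so link latency can matter as much as raw bandwidth; the sketch does not model that.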

... shouldn't the logic be the opposite? "Bad that SLI went out of fashion, there's no way for two GPUs to communicate fast over PCIe, and SLI would allow such a fast bridge"

  • Whether or not SLI remained viable for gaming, Broadcom was going to jack up the prices on PCIe switches to the enterprise-only range. That's the real reason why consumer motherboards don't have more GPU slots. Mainstream consumer CPU sockets never had a wealth of PCIe lanes, there was just a brief span of years where PCIe switches were cheap so high-end consumer boards could offer several x8 or x16 slots (sharing bandwidth in ways that make diagrams like these important).

    In previous decades, non-mainstream CPU sockets were also more accessible to consumer budgets; first-gen Threadripper started at only 8 cores, so it was possible to pay extra for more memory channels and IO lanes without also buying an excess of CPU cores. But that had little to do with the popularity or viability of multi-GPU consumer systems.

    • But PCIe switches are now more common than ever. How else do you think those high-end consumer boards are able to provide six M.2 slots?
