Comment by eqvinox

12 hours ago

> This is using Thunderbolt networking as transport,

Are you sure? It doesn't sound like it in some places in the text, e.g.:

>> a kernel driver that sits alongside thunderbolt-net, allocating DMA rings from the controller's NHI port in the same way

but I don't have the domain knowledge to tell…

Yes, the description from TFA does not match the traditional Thunderbolt networking protocol, whose performance may be as low as that of a 10 Gb/s Ethernet interface.

The description from TFA matches what the poster above you said about a new Linux device driver that allows access to the raw Thunderbolt protocol for transferring data between computers. This appears to be an independent implementation of the same principle as the device driver that will be merged into mainline Linux.

While the official Linux device driver makes raw Thunderbolt appear as a file that can be written and read to transfer data, this implementation emulates an InfiniBand interface, which was presumably simpler to use for distributing work over multiple GPUs.

They actually mention that with traditional Thunderbolt networking on the same computers, they had obtained only 9 Gb/s, i.e. more than 5 times slower than what they obtained with raw Thunderbolt.

  • > traditional Thunderbolt networking protocol ... performance may be as low as that of a 10 Gb/s Ethernet interface.

    Ouch. Why so much lower than the physical bandwidth (or what they've achieved here)?

    • A USB4 40Gbps cable consists of two 20G tx/rx pairs. The in-kernel networking implementation is single-stream and uses just one pair; it won't, e.g., stripe across both pairs or across multiple cables, which was the main bandwidth unlock in TFA. Doing so would be a much more complicated undertaking, since you've now re-introduced out-of-order delivery, which complicates re-assembly of large packets, retries, loss handling etc. The verbs interface is a lot simpler than a full IP stack, so although it was possible to get this working across rails, it may not be so simple for something pretending to be Ethernet.
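
To make the re-assembly point concrete, here is a toy sketch of striping one payload across two lanes and putting it back together with sequence numbers. It is not how thunderbolt-net or the driver in TFA works; the function names (`stripe`, `interleave`, `reassemble`) and the chunk size are made up for illustration, and loss and retries are not handled at all.

```python
import random

def stripe(payload: bytes, lanes: int, chunk: int):
    """Split payload into (seq, data) chunks and deal them round-robin onto lanes."""
    queues = [[] for _ in range(lanes)]
    for seq, off in enumerate(range(0, len(payload), chunk)):
        queues[seq % lanes].append((seq, payload[off:off + chunk]))
    return queues

def interleave(queues, rng):
    """Simulate arrival: order is kept within a lane, arbitrary across lanes."""
    pending = [list(q) for q in queues]
    arrived = []
    while any(pending):
        lane = rng.choice([q for q in pending if q])  # pick any non-empty lane
        arrived.append(lane.pop(0))
    return arrived

def reassemble(arrived):
    """Buffer early chunks until the next expected sequence number shows up."""
    buffered = {}
    next_seq = 0
    out = bytearray()
    for seq, data in arrived:
        buffered[seq] = data
        while next_seq in buffered:
            out += buffered.pop(next_seq)
            next_seq += 1
    return bytes(out)

payload = bytes(range(256)) * 4
queues = stripe(payload, lanes=2, chunk=64)
arrived = interleave(queues, random.Random(0))
assert reassemble(arrived) == payload
```

Even in this lossless toy, the receiver needs a reorder buffer; add drops and retransmits and the bookkeeping grows quickly, which is the complexity the single-stream implementation avoids.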
