Comment by grw_
4 hours ago
Ah right, yes. I think we're talking about the same thing: this driver just chooses to pretend to be a RoCE v2 device (instead of e.g. an MLX NIC in IB mode), but I don't think anything would change if it presented itself as IB instead. Or at least that's what the libibverbs abstraction promises.
There's no IB or Ethernet underneath. I could have implemented this properly as its own distinct transport kind, but it seemed easier to just pretend to be something that is already known.
In this instance, 'the chip that understands both TB and IB and translate RDMA requests between the two' is your CPU, so latency is orders of magnitude worse than with an ASIC, but still better than anything on top of IP/Ethernet. I think there's also potential to do device-initiated RDMA, where e.g. a GPU itself can write to some mailbox and have the message appear across the abstracted transport in another GPU's mailbox. Even if the CPU is involved in shuffling pointers across mailboxes, that doesn't necessarily mean it'll be a bottleneck.
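To make the mailbox idea a bit more concrete, here is a rough Python sketch of the shape I have in mind. It is not the driver's code and it models no real GPU or RDMA API: `Descriptor`, `Mailbox`, and `cpu_relay` are all hypothetical names, and the "transport" is just a function call. The point it tries to show is that the CPU only moves small descriptors (pointers), never the payload they refer to.

```python
from collections import deque, namedtuple

# Hypothetical descriptor: names a region of a buffer, carries no payload.
Descriptor = namedtuple("Descriptor", ["buffer_id", "offset", "length"])


class Mailbox:
    """Bounded FIFO standing in for a device-visible mailbox (illustrative only)."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = deque()

    def post(self, descriptor):
        """Enqueue a descriptor; return False if the mailbox is full."""
        if len(self.slots) >= self.capacity:
            return False
        self.slots.append(descriptor)
        return True

    def take(self):
        """Dequeue the oldest descriptor, or return None if empty."""
        return self.slots.popleft() if self.slots else None


def cpu_relay(src, dst):
    """Move descriptors from src to dst until src is empty or dst is full.

    Only the small descriptors pass through here; the payload bytes they
    point at are never copied by this function.
    """
    moved = 0
    while src.slots and len(dst.slots) < dst.capacity:
        dst.post(src.take())
        moved += 1
    return moved


# Sender "GPU" posts a pointer to 4 KiB of data; the relay forwards it.
gpu_a_outbox = Mailbox()
gpu_b_inbox = Mailbox()
gpu_a_outbox.post(Descriptor(buffer_id=0, offset=0, length=4096))
moved = cpu_relay(gpu_a_outbox, gpu_b_inbox)
received = gpu_b_inbox.take()
```

The per-message work in `cpu_relay` is independent of `length`, which is why shuffling pointers on the CPU need not become the bottleneck, assuming the payload itself moves by some other path.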