Comment by brucehoult

17 hours ago

> it’s not even clear if it can saturate a gigabit

If that's the case then it's not the CPU's fault. I can't open the linked site, but assuming it's really the same as a BPI-F3, i.e. a SpacemiT K1 chip, that can do 2.8 GB/s on a large RAM-to-RAM memcpy using a CPU core, i.e. about 44 Gbps of memory traffic in total, 22 Gbps each for read and write. Plus I assume it's got DMA, so there's no need to involve the CPU anyway.
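For anyone checking the unit conversion, here is a minimal sketch of the arithmetic. The function name is just for illustration, and it assumes 1 GB = 10^9 bytes and that a memcpy reads and writes every byte exactly once.

```python
def memcpy_traffic_gbps(copy_rate_gbytes_per_s):
    """Convert a memcpy rate in GB/s into memory traffic in Gbps.

    Assumption: each copied byte is read once and written once, so the
    total traffic is twice the one-way rate.
    """
    one_way_gbps = copy_rate_gbytes_per_s * 8  # 8 bits per byte
    total_gbps = 2 * one_way_gbps              # read stream + write stream
    return one_way_gbps, total_gbps


one_way, total = memcpy_traffic_gbps(2.8)  # K1 figure quoted above
print(f"one way: {one_way:.1f} Gbps, read + write: {total:.1f} Gbps")
print(f"one-way rate vs. a 1 Gbps link: {one_way / 1.0:.0f}x")
```

Even counting only one direction, the copy rate is more than 20 times what a gigabit link can deliver.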

Here is a test I ran in April 2025 on a Sipeed LicheePi 3A (same chip).

https://hoult.org/K1_memcpy.txt

> RISC-V is quite wimpy this far

The new K3 chip from the same manufacturer does 8.7 GB/s RAM to RAM memcpy using a dual issue in-order A100 ("AI") core, just over 3x faster.

Sure, this pales in comparison to recent Apple / Intel / AMD, but it's a lot faster than home networking.

Although your benchmark is interesting, I don't think it's very relevant here. In my experience, you'll saturate the CPU through packet decoding, routing, and firewalling long before memory becomes a bottleneck.

That's why all network SoCs have hardware to accelerate such things; otherwise, in software alone, they can barely handle simple routing at a few hundred Mbps.

That chip doesn't seem to have that: https://cdn-resource.spacemit.com/file/chip/K1/K1_datasheet_...

  • 1 Gb/s is only ~100,000 packets/s at standard MTU. You literally get 10 us/packet, which is an eternity. Normal fast-path router operation only really needs to consider the header, which is <100 bytes/packet, so you are getting ~100 ns of compute per byte of considered data. On even a 1 GHz processor that is over 100 instructions per byte of considered data. Failure to achieve a measly 1 Gb/s says more about those software implementations than it says about the impossibility or difficulty of the problem.
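The budget can be worked out more precisely with a short sketch. It assumes full-size 1500-byte packets, ignores Ethernet framing overhead, treats 100 bytes as the upper bound on header data examined, and assumes roughly one instruction per cycle. The function and parameter names are illustrative only.

```python
def packet_budget(link_bps, packet_bytes, header_bytes, clock_hz):
    """Per-packet time budget for forwarding at line rate.

    Assumptions: every packet is packet_bytes long, framing overhead
    (preamble, inter-frame gap) is ignored, and only header_bytes of
    each packet have to be inspected.
    """
    packets_per_s = link_bps / (packet_bytes * 8)
    ns_per_packet = 1e9 / packets_per_s
    ns_per_header_byte = ns_per_packet / header_bytes
    cycles_per_header_byte = ns_per_header_byte * clock_hz / 1e9
    return packets_per_s, ns_per_packet, ns_per_header_byte, cycles_per_header_byte


pps, ns_pkt, ns_byte, cycles_byte = packet_budget(
    link_bps=1e9,       # 1 Gb/s link
    packet_bytes=1500,  # standard MTU
    header_bytes=100,   # upper bound on header data considered
    clock_hz=1e9,       # 1 GHz core
)
print(f"{pps:,.0f} packets/s, {ns_pkt / 1000:.0f} us per packet")
print(f"{ns_byte:.0f} ns and {cycles_byte:.0f} cycles per header byte")
```

With these inputs the exact figures come out to roughly 83,000 packets/s and 12 us per packet, so the round numbers quoted above slightly understate the time available per packet.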