Comment by metadat

17 days ago

The key takeaway is hidden in the middle:

> In extreme cases, on purely CPU bound benchmarks, we’re seeing a jump from < 1Gbit/s to 4 Gbit/s. Looking at CPU flamegraphs, the majority of CPU time is now spent in I/O system calls and cryptography code.

A more than 4x jump in throughput, which should translate to a proportionate reduction in CPU time per byte transferred for UDP network activity. That's pretty cool, especially for better power efficiency on portable clients (mobile and notebook).

I found this presentation refreshing. Too often, transitions to "modern" stacks are treated as inherently good, without the data to back them up.

Any guesses on whether they have other cases where they got more than 4 Gbit/s but weren't CPU bound, or was this the fastest they got?

  • _Author here_.

    4 Gbit/s is on our rather dated benchmark machines. If you run the command below on a modern laptop, you'll likely reach higher throughput. (Consider disabling PMTUD to use a realistic Internet-like MTU. We do the same on our benchmark machines.)

    https://github.com/mozilla/neqo

    cargo bench --features bench --bench main -- "Download"
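
    In full, from a clean machine, the sequence is roughly the following (a sketch, assuming git and a recent stable Rust toolchain are installed):

    # clone the repository, then run the download benchmark
    git clone https://github.com/mozilla/neqo
    cd neqo
    cargo bench --features bench --bench main -- "Download"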

I wonder if we'll ever see hardware-accelerated cross-context message passing for user and system programs.