Comment by vasilvv
4 hours ago
> though going the in-house approach would warrant keeping an eye on the relevant kernel commits like a hawk to avoid missing bug fixes like these. These in-house implementations tend to have less eyeballs than the kernel.
This is somewhat funny to read because this specific issue in CUBIC (sudden CWND jump upon existing quiescence) was originally discovered in Google's QUIC library and then later reported to the team working on the TCP stack. I know this because I was the one who found that bug back in 2015.
That said, congestion control algorithms are really prone to logic bugs, and very subtle changes in the algorithm can often lead to dramatically different outcomes. Because of that, there's a lot of value in running congestion control code that has been tested on a wide variety of real Internet traffic.
Would formal validation of these algorithms (e.g. with TLA+) help avoid such bugs?
I think a audited algorithm where each type is strictly defined like int32 added to that really help with what exactly should be inputted to it so it remains correct.