Comment by nevereveragain
3 years ago
Really frustrating to see the amount of effort that went into gaming those benchmarks while catastrophic performance issues in core Microsoft-authored libraries go unfixed for years. https://github.com/dotnet/SqlClient/issues/593
This is my favorite issue at the moment.
It's completely unintuitive: MS wants you to use as much async as possible and provides an async API, but hasn't fixed the damn issue in years. Most senior devs won't know about this; they'll use the methods and probably won't even notice in most production scenarios. At least not until it hits them like a ton of bricks.
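For context: as I understand the issue, async reads of large values (big varchar(max)/varbinary(max) columns) are orders of magnitude slower than the sync equivalents. A minimal sketch of the kind of code that hits it (connection string, table and column names made up):

    // using Microsoft.Data.SqlClient;
    using var conn = new SqlConnection(connectionString);
    await conn.OpenAsync();

    using var cmd = new SqlCommand("SELECT BigText FROM Docs WHERE Id = @id", conn);
    cmd.Parameters.AddWithValue("@id", 1);

    using var reader = await cmd.ExecuteReaderAsync();
    if (await reader.ReadAsync())
    {
        // With a multi-megabyte value this path is drastically slower
        // than the same code written with Open/ExecuteReader/Read.
        string value = reader.GetString(0);
    }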
Another favorite .NET tidbit of mine is the existence of Microsoft.VisualStudio.Threading. If you mix a lot of async code with UI code you will eventually run into deadlocks. MS noticed that too, wrote this library for Visual Studio (hence the name) to work around it, and then released it to the public.
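The classic trap looks roughly like this (sketch; GetDataAsync stands in for any async call):

    // WinForms-style event handler:
    private void Button_Click(object sender, EventArgs e)
    {
        // GetDataAsync's continuation gets posted back to the UI thread's
        // SynchronizationContext, but .Result is blocking that very
        // thread: deadlock.
        var data = GetDataAsync().Result;

        // Microsoft.VisualStudio.Threading's JoinableTaskFactory avoids
        // this by letting the blocked thread process the queued work:
        // var data = joinableTaskFactory.Run(() => GetDataAsync());
    }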
Another one:

    enumerable.Any()

Or:

    enumerable.Count() > 0
Also fun (though understandable why they don't fix it): with VS2019 they included a new "code fix"/analyzer that very annoyingly tells you to convert the latter into the former for "code readability", making your code slower.
Don't get me wrong, I like .NET a lot but it has quite a few footguns.
Why is the latter faster than the former?
The latter has a fast path for arrays and lists, the former doesn't.
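Simplified sketch of that fast path (modeled on LINQ's Count(); not the verbatim framework source):

    using System.Collections.Generic;

    static class EnumerableSketch
    {
        public static int Count<TSource>(this IEnumerable<TSource> source)
        {
            // Fast path: arrays and List<T> implement ICollection<T>,
            // so the stored count is returned without enumerating.
            if (source is ICollection<TSource> collection)
                return collection.Count;

            // Slow path: walk the whole sequence.
            int count = 0;
            using var e = source.GetEnumerator();
            while (e.MoveNext())
                count++;
            return count;
        }
    }

Any(), at the time, had no such check: it always allocated an enumerator and called MoveNext() once.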
For Entity Framework (Core) and LINQ to SQL it doesn't matter, though: those translate the query to SQL instead of enumerating in memory.
> It does pretty much what you say: it repeatedly copies data (2 bytes at a time though), without materializing a string. So if the string you try to read is 10 packets long, you end up copying 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55 packets. If you need 100 packets, you end up copying them 5050 times. So to read N bytes you copy O(N^2) bytes: it's impossible to scale this way. Whether bytes are copied 2 at a time or using a more optimal algorithm doesn't really matter.
Good lord. It took them about a year to find this obvious O(n^2) hot loop and they're unable to fix it.
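For anyone who hasn't seen it, the pattern reduced to its essence (illustrative; not SqlClient's actual code), assuming packets is the sequence of received chunks:

    // O(N^2): every new packet re-copies everything received so far.
    byte[] data = Array.Empty<byte>();
    foreach (byte[] packet in packets)
    {
        var grown = new byte[data.Length + packet.Length];
        Buffer.BlockCopy(data, 0, grown, 0, data.Length);
        Buffer.BlockCopy(packet, 0, grown, data.Length, packet.Length);
        data = grown;
    }

    // O(N): append into one growing buffer, materialize once.
    var ms = new MemoryStream();
    foreach (byte[] packet in packets)
        ms.Write(packet, 0, packet.Length);
    byte[] result = ms.ToArray();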
To me it seems like the people writing that parser are doing it piecemeal, adding little spot fixes instead of building a consistent, robust stream parser that won't crash or slow to molasses if you look at it wrong.
In several places they mention that the parser was never tested with partial buffer fills because replayed packet captures from disk don’t do that. Real network sockets do.
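That matters because a Read on a network stream may return anywhere from 1 byte up to the requested count; a correct parser has to loop until it has a full frame (illustrative helper, names mine):

    static void ReadExactly(Stream stream, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            // Read returns however many bytes happen to be available.
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed mid-frame.");
            offset += read;
        }
    }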
Suddenly I can see how the security bug happened where data ended up in the wrong SqlConnection under high load.
I bet they have a bunch of concurrency bugs in that spaghetti code…