Comment by FZambia

1 day ago

Client SDKs are often a major challenge in systems like these. In my experience, building SDKs on top of asynchronous protocols is particularly tricky. It's generally much easier to make the server-side part reliable. The complexity arises because SDKs must account for a wide range of usage patterns - and you are not controlling the usage.

Asynchronous protocols frequently result in callback-based or generator-style APIs on the client side, which are hard to implement safely and intuitively. For example, consider building a real-time SDK for something like NATS. Once a message arrives, you need to invoke a user-defined callback to handle it. At that point, you're faced with a design decision: either call the callback synchronously (which risks blocking the socket reading loop), or do it asynchronously (which raises issues like backpressure handling).

Also, SDKs are often developed by different people, each with their own design philosophy and coding style, leading to inconsistency and subtle bugs.

So this isn't only about NATS. Just last week, we ran into two critical bugs in two separate Kafka SDKs at work.