Comment by kerblang

2 months ago

Kafka is really not intended to improve on this. Instead, it's intended for very high-volume ETL processing, where a classical message queue delivering records would spend too much time on locking. Kafka is hot-rodding the message queue design and removing guard rails to get more messages thru faster.

Generally I say, "Message queues are for tasks, Kafka is for data." But in the latter case, if your data volume is not huge, a message queue for async ETL will do just fine and give better guarantees as FIFO goes.

In essence, Kafka is a very specialized version of much more general-purpose message queues, which should be your default starting point. It's similar to replacing a SQL RDBMS with some kind of special NoSQL system - if you need it, okay, but otherwise the general-purpose default is usually the better option.

2 comments

kerblang

CharlieDigital 2 months ago

Of course this is not the same as Kafka, but the comment I'm replying to:

    > Ah yes, and every consumer should just do this in a while (true) loop as producers write to it. Very efficient and simple with no possibility of lock contention or hot spots. Genius, really.

Seemed to imply that it's not possible to build a high performance pub/sub system using a simple SQL select. I do not think that is true and it is in fact fairly easy to build a high performance pub/sub system with a simple SQL select. Clearly, this design as proposed is not the same as Kafka.

alexjplant 2 months ago

No, I implied that implementing pub/sub with just a select statement is silly because it is. Your implementation accounts for the downfalls of this approach with smart design using a message queue and intelligent locking semantics. Parent of my comment was glib and included none of this.