← Back to context

Comment by cogman10

7 months ago

> streaming system will process n messages in O (n log n)

I'm guessing this is mostly around how backed up the stream is. n isn't the total number of messages but rather the current number of unacked messages.

Would a radix structure work better here? If you throw something like a UUID7 on the messages and store them in a radix structure you should be able to get O(n) performance here correct? Or am I not understanding the problem well.

I think the problem is that if you want quick access to all messages with a particular key then you have to maintain some kind of index over all persisted messages. So n would be total number of persisted messages as I read it, which can be quite large. But even storing them in the first place is O(n), so O(n log n) might not be so bad.

  • That's correct. And keep in mind that you might have new consumers starting from the beginning come into play, so you have to permanently retain the indexes.

    And yes, O(n log n ) is not bad at all. Sorted database indexes (whether SQL, NoSQL, or AcmeVendorSQL, etc.) already take O(n log n) to insert n elements into data storage or to read n elements from data storage.