Comment by airocker

1 year ago

That's exactly what we do, but taking a lock costs one round trip (RTT) to the database, which means about 100ms. That limits the number of events the receivers can handle. If you have too many events, the receivers will spend most of their time just trying to take a lock.

Off the top of my head, you could attach a UUID or sequence number to the emitted event. Then, based on the UUID or sequence number, you can decide which event consumer picks it up?

E.g. you have two consumers: if the sequence number is odd, A picks it; if it's even, B picks it.
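A minimal sketch of that odd/even idea, generalized to any number of consumers by taking the sequence number (or the UUID's integer value) modulo the consumer count. The function names and the consumer labels here are made up for illustration and are not from any particular library.

```python
import uuid

def pick_by_sequence(sequence_number: int, consumers: list) -> str:
    """Route an event to a consumer by sequence number modulo consumer count."""
    return consumers[sequence_number % len(consumers)]

def pick_by_uuid(event_id: uuid.UUID, consumers: list) -> str:
    """Same idea for UUIDs: use the UUID's 128-bit integer value."""
    return consumers[event_id.int % len(consumers)]

# Index 0 gets even sequence numbers, index 1 gets odd ones,
# so this ordering matches "odd goes to A, even goes to B".
consumers = ["B", "A"]

print(pick_by_sequence(3, consumers))
print(pick_by_sequence(4, consumers))
```

Every consumer can evaluate this locally, so no lock and no database round trip are needed. The catch is that the assignment is fixed up front and ignores whether the chosen consumer is currently busy.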

  • Great observation, I worded that a little wrongly. What we ideally want is guaranteed delivery to one random free worker. The UUID strategy is better than locking, but it could mean that if one worker gets a longer job, all the other jobs assigned to that worker are delayed even if another worker is free.

  • I think the concept you mentioned would be called sharding? It's kinda required for Apache Kafka and others.

  • Without more logic than this, does it mean you lose jobs whenever a worker is busy or down?

    • We use a garbage collector that marks a job as errored and restarts it if it is not served within a specified amount of time.
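A rough sketch of what such a timeout sweep could look like. This is only an illustration under assumptions: the `Job` fields, the `restart_stale` name and the in-memory list are hypothetical, and a real setup would presumably run this against the jobs table on a schedule.

```python
from dataclasses import dataclass

@dataclass
class Job:
    job_id: str
    assigned_at: float      # time the job was handed to a worker (seconds)
    done: bool = False
    attempts: int = 0

def restart_stale(jobs: list, timeout_s: float, now: float) -> list:
    """Reset every unfinished job whose deadline has passed.

    Returns the ids of the jobs that were restarted.
    """
    restarted = []
    for job in jobs:
        if job.done:
            continue
        if now - job.assigned_at > timeout_s:
            job.attempts += 1       # count the failed attempt
            job.assigned_at = now   # restart the clock for the retry
            restarted.append(job.job_id)
    return restarted

jobs = [
    Job("a", assigned_at=0.0),             # stuck, past the deadline
    Job("b", assigned_at=95.0),            # still within the deadline
    Job("c", assigned_at=0.0, done=True),  # finished, ignored
]
print(restart_stale(jobs, timeout_s=30.0, now=100.0))
```

With a sweep like this, a busy or dead worker delays a job by at most the timeout plus the sweep interval rather than losing it, at the cost of possibly running a job twice if the original worker was only slow.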