Comment by skywhopper
1 month ago
Redis is fundamentally the wrong storage system for a job queue when you have an RDBMS handy. This is not new information. You still might want to split the job queue onto its own DB server when things start getting busy, though.
For caching, though, I wouldn’t drop Redis so fast. As an in-memory cache, the ops overhead of running Redis is a lot lower. You can even ignore HA for most use cases.
Source: I helped design and run a multi-tiered Redis caching architecture for a Rails-based SaaS serving millions of daily users, coordinating shared data across hundreds of database clusters and thousands of app servers across a dozen AWS regions, with separate per-host, per-cluster, per-region, and global cache layers.
We used Postgres for the job queues, though. Entirely separate from the primary app DBs.
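The Postgres-as-job-queue approach above usually boils down to "atomically claim the oldest queued row" (on Postgres itself, typically `SELECT ... FOR UPDATE SKIP LOCKED`). A minimal sketch of that claim-exactly-once idea, using stdlib sqlite3 in place of Postgres and an optimistic `UPDATE ... WHERE status = 'queued'` so it stays self-contained; the table name, columns, and payloads are made up for illustration:

```python
# Illustrative RDBMS job-queue pattern: workers race to flip a row from
# 'queued' to 'running'; whoever's UPDATE matches wins the job.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'queued'
    )
""")
conn.executemany("INSERT INTO jobs (payload) VALUES (?)",
                 [("send_email",), ("resize_image",)])
conn.commit()

def claim_next_job(conn):
    """Claim the oldest queued job, tolerating races with other workers."""
    while True:
        row = conn.execute(
            "SELECT id, payload FROM jobs "
            "WHERE status = 'queued' ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return None  # queue drained
        job_id, payload = row
        cur = conn.execute(
            "UPDATE jobs SET status = 'running' "
            "WHERE id = ? AND status = 'queued'", (job_id,))
        conn.commit()
        if cur.rowcount == 1:  # we won the race for this job
            return job_id, payload
        # another worker claimed it first; loop and try the next one

job = claim_next_job(conn)
```

On real Postgres you'd wrap the select-and-claim in one transaction with `FOR UPDATE SKIP LOCKED` instead of looping, which is what makes the RDBMS viable as a queue under contention.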
> Redis is fundamentally the wrong storage system for a job queue when you have an RDBMS handy
One could go one step further and say an RDBMS is fundamentally the wrong storage system for a job queue when you have a persistent, purpose-built message queue handy.
Honestly, for most people, I'd recommend they just use their cloud provider's native message queue offering. On AWS, SQS is cheap, reliable, easy to start with, and gives you plenty of room to grow. GCP PubSub and Azure Storage Queues are probably similar in these regards.
Unless managing queues is your business, I wouldn't make it your problem. Hand that undifferentiated heavy lifting off.
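What makes SQS forgiving for beginners is its delivery model: receiving a message doesn't delete it, it just hides it for a visibility timeout, and an unacknowledged message reappears for redelivery. A toy in-memory model of that lifecycle (class name, IDs, and message bodies are invented for illustration; real code would use boto3's `send_message` / `receive_message` / `delete_message`):

```python
# Toy model of the SQS receive/visibility-timeout/delete lifecycle.
# Not an SQS client -- just the semantics, in-memory.
import itertools
import time

class ToyQueue:
    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        self._messages = {}  # id -> (body, visible_again_at)

    def send(self, body):
        mid = next(self._ids)
        self._messages[mid] = (body, 0.0)  # immediately visible
        return mid

    def receive(self, now=None):
        now = time.monotonic() if now is None else now
        for mid, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                # hide the message until the visibility timeout elapses
                self._messages[mid] = (body, now + self.visibility_timeout)
                return mid, body
        return None

    def delete(self, mid):
        # acknowledge successful processing; only now is the message gone
        self._messages.pop(mid, None)

q = ToyQueue(visibility_timeout=30.0)
q.send("process-order-123")
mid, body = q.receive(now=0.0)
q.delete(mid)
```

Crash before `delete` and the message simply comes back after the timeout, which is the at-least-once guarantee doing the heavy lifting for you.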
Rails shops seem not to like SQS/PubSub/Kafka/RabbitMQ for some reason. They seem to really like worker frameworks like Sidekiq or SolidQueue. Compare this with Java, C#, or Python shops, which all seem much more likely to use a separate message queue and have that handle the job queue.
Rails shops running on normal CRuby have difficulty scaling out multithreading effectively due to the GVL. It's much easier to "scale" Ruby by forking with Sidekiq or a multi-process setup, and have workers consume data from a Redis list. It is possible to get around the GVL using JRuby, but that poses a different set of constraints and issues.
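CRuby's GVL is closely analogous to CPython's GIL, so the same workaround translates: fork processes instead of spawning threads, giving each worker its own interpreter lock. A hedged sketch using Python's stdlib (the function and workload are invented; Sidekiq's actual process model is far more sophisticated than this):

```python
# Why forking sidesteps the interpreter lock: each pool worker is a
# separate process with its own lock, so CPU-bound jobs run in parallel.
from multiprocessing import Pool

def cpu_bound_job(n):
    # stand-in for real work that threads could not parallelize
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(cpu_bound_job, [10_000] * 4)
```

The same four jobs run on four threads would serialize on the lock; on four processes they use four cores.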
There is some definite blending of async messaging in the Ruby world though. I've seen connectors which take protobufs off a Kafka topic and use Sidekiq to fan out the work. With Redis (looking at Sidekiq specifically) it becomes trivial to maintain the "current" working set of items popped out of the queue, with atomic commands like BLMOVE (formerly BRPOPLPUSH).
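The BLMOVE pattern mentioned above is the "reliable queue": atomically pop a job from the pending list onto a per-worker processing list, so a worker crash leaves the job recoverable instead of lost. Sketched here over plain Python deques standing in for Redis lists (list names and job IDs are invented; real code would call something like redis-py's `r.blmove("queue:pending", "queue:processing:worker-1", timeout)`):

```python
# In-memory mimic of Redis LMOVE/LREM to show the reliable-queue shape.
from collections import deque

lists = {
    "queue:pending": deque(["job-1", "job-2"]),
    "queue:processing:worker-1": deque(),
}

def lmove(src, dst):
    """Mimic LMOVE src dst: atomic pop from one list, push onto another."""
    if not lists[src]:
        return None
    job = lists[src].popleft()
    lists[dst].append(job)
    return job

def ack(worker_list, job):
    """On success, remove the job from the processing list (LREM in Redis)."""
    lists[worker_list].remove(job)

job = lmove("queue:pending", "queue:processing:worker-1")
# ... do the work; if we crash here, the job is still on our
# processing list and a reaper can move it back to pending ...
ack("queue:processing:worker-1", job)
```

The atomicity of the pop-and-push is the whole trick: there is no instant where the job exists in neither list.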
Kafka is taking an interesting turn, however, with the KIP-932 "Queues for Kafka" initiative. I personally believe it could eat RabbitMQ's lunch if done effectively: it allows multiple consumers and a "working set" of unacked data, without having to worry as much about the topic partition count.
I've also noticed that they conflate the notions of workers, queues, and message buses. A worker handles asynchronous tasks, but the means by which workers communicate might be best served by either a queue or a message bus, depending on the specific needs. Tight coupling might be good for knocking out PoCs quickly, but once you have production-grade needs, the model begins to show its weaknesses.
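The queue-versus-bus distinction in miniature: a work queue delivers each message to exactly one competing consumer, while a message bus (pub/sub) fans each message out to every subscriber. Toy in-memory versions of both, with invented class and message names:

```python
# Minimal contrast of the two delivery models the comment distinguishes.
from collections import deque

class WorkQueue:
    def __init__(self):
        self._items = deque()
    def publish(self, msg):
        self._items.append(msg)
    def consume(self):
        # competing consumers: each message is handed out exactly once
        return self._items.popleft() if self._items else None

class MessageBus:
    def __init__(self):
        self._subscribers = []
    def subscribe(self, handler):
        self._subscribers.append(handler)
    def publish(self, msg):
        # fan-out: every subscriber sees every message
        for handler in self._subscribers:
            handler(msg)

q = WorkQueue()
q.publish("task-1")

bus = MessageBus()
seen_a, seen_b = [], []
bus.subscribe(seen_a.append)
bus.subscribe(seen_b.append)
bus.publish("event-1")
```

Picking one abstraction and bolting the other's semantics on later is usually where the tight coupling starts to hurt.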