Comment by antirez
13 hours ago
Every time some production environment can be simplified, it is good news in my opinion. The ideal situation with Rails would be if there is a simple way to switch back to Redis, so that you can start simple, and as soon as you hit some fundamental issue with using SolidQueue (mostly scalability, I guess, in environments where the queue is truly stressed -- and you don't want to have a Postgres scalability problem because of your queue), you have a simple upgrade path. But I bet a lot of Rails apps don't have high volumes, and having to maintain two systems can be just more complexity.
> The ideal situation with Rails would be if there is a simple way to switch back to Redis
That's largely the case.
Rails provides an abstracted API for jobs (Active Job). Of course, some applications do depend on features specific to a queue implementation, but in the general case you just need to update your config to switch over (and, of course, handle draining the old queue).
Since you're here - https://redis.io/docs/latest/operate/oss_and_stack/managemen...
In AOF mode, does Redis write all changes to a WAL? Is this paired with periodic snapshotting to prevent the log from growing too large? Does this work in distributed mode, or is it a single-node thing?
The primary pain point I see here is if devs lean into transactions such that their job is only created together with everything else that happened.
Losing that guarantee can make the eventual migration harder, even if that migration is to a different postgres instance than the primary db.
That's also something Rails helps abstract away by automatically deferring enqueues until after the transaction has committed.
Even SolidQueue behaves that way by default.
https://github.com/rails/rails/pull/51426
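To make the "defer enqueues until after commit" idea concrete, here is a minimal, framework-free sketch. It is not how Rails implements it; `Transaction`, `Queue`, and `after_commit` are hypothetical names invented for illustration, and the "transaction" is a toy context manager rather than a real database transaction.

```python
class Transaction:
    """Toy transaction that holds callbacks until the block commits."""

    def __init__(self):
        self._after_commit = []

    def after_commit(self, callback):
        # Remember work that must only happen once the commit succeeds.
        self._after_commit.append(callback)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # "Commit": run the deferred callbacks in registration order.
            callbacks, self._after_commit = self._after_commit, []
            for callback in callbacks:
                callback()
        else:
            # "Rollback": drop the deferred work, nothing is enqueued.
            self._after_commit.clear()
        return False  # never swallow the exception


class Queue:
    """Toy job queue that can defer an enqueue to a transaction's commit."""

    def __init__(self):
        self.jobs = []

    def enqueue(self, job, tx=None):
        if tx is None:
            self.jobs.append(job)  # no open transaction: enqueue right away
        else:
            tx.after_commit(lambda: self.jobs.append(job))
```

With this shape, a job enqueued inside a `with Transaction() as tx:` block only becomes visible to workers once the block exits cleanly, and it is silently dropped if the block raises. The trade-off is that the commit and the enqueue are still two separate steps, so a crash between them can lose the job.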
You can look at it both ways.
Using the database as a queue, you no longer need to set up transaction triggers to fire your tasks. You get an atomic guarantee: either the data and the task were both created successfully, or nothing was created.
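A rough sketch of that atomic guarantee, using Python's built-in `sqlite3` as a stand-in for the application database. The `users` and `jobs` tables, the `sign_up` helper, and the `job_kind` parameter are assumptions made up for this example; they are not SolidQueue's or River's schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding both app data and the job table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, kind TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def sign_up(conn, email, job_kind="welcome_email"):
    """Insert the user row and its follow-up job in a single transaction."""
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the user and the job are persisted together or not at all.
    with conn:
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.execute(
            "INSERT INTO jobs (kind, payload) VALUES (?, ?)", (job_kind, email)
        )


def count(conn, table):
    """Row count for one of the two fixed table names used in this sketch."""
    if table not in ("users", "jobs"):
        raise ValueError(f"unknown table: {table}")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

If the job insert fails (for example, passing `job_kind=None` violates the `NOT NULL` constraint), the user insert from the same call is rolled back too, so there is never a user without its job or a job without its user.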
the problem i see here is that we end up treating the background job/task processor as part of the production system (e.g. the server that responds to requests, in the case of a web application) instead of a separate standalone thing. rails doesn’t make this distinction clear enough. it’s okay to back your tasks processor with a pg database (e.g. river[0]) but, as you indirectly pointed out, it shouldn’t be the same as the production database. this is why redis was preferred anyways: it was a lightweight database for the task processor to store state, etc. there are still great arguments in favor of this setup. from what i’ve seen so far, solidqueue doesn’t make this separation.
[0]: https://riverqueue.com/
SolidQueue uses its own db configuration.
> it shouldn’t be the same as the production database
This is highly dependent on the application (scale, usage, phase of lifecycle, etc.)
Yeah, River generally recommends this pattern as well (River co-author here :)
To get the benefits of transactional enqueueing you generally need to commit the jobs transactionally with other database changes. https://riverqueue.com/docs/transactional-enqueueing
It does not scale forever, and as you grow in throughput and job table size you will probably need to do some tuning to keep things running smoothly. But after the amount of time I've spent in my career tracking down those numerous distributed systems issues arising from a non-transactional queue, I've come to believe this model is the right starting point for the vast majority of applications. That's especially true given how high the performance ceiling is on newer / more modern job queues and hardware relative to where things were 10+ years ago.
If you are lucky enough to grow into the range of many thousands of jobs per second then you can start thinking about putting in all that extra work to build a robust multi-datastore queueing system, or even just move specific high-volume jobs into a dedicated system. Most apps will never hit this point, but if you do you'll have deferred a ton of complexity and pain until it's truly justified.
> it shouldn’t be the same as the production database
Why is that?
Here's an example from the CircleCI incident:
https://status.circleci.com/incidents/hr0mm9xmm3x6
and a good analysis by a Flickr engineer who ran into similar issues:
https://blog.mihasya.com/2015/07/19/thoughts-evoked-by-circl...
If you need to restore the production database do you also want to restore the task database?
If your task is to send an email, do you want to send it again? Probably not.
It’s not necessary to separate queue db from application db.
got it. is it necessary, then, to couple queue db with app db? if the answer is no, then we can’t make a necessity argument here, unfortunately.