Comment by yawboakye

1 month ago

the problem i see here is that we end up treating the background job/task processor as part of the production system (e.g. the server that responds to requests, in the case of a web application) instead of as a separate, standalone thing. rails doesn’t make this distinction clear enough. it’s okay to back your task processor with a pg database (e.g. river[0]) but, as you indirectly pointed out, it shouldn’t be the same as the production database. this is why redis was preferred anyway: it was a lightweight database for the task processor to store state, etc. there are still great arguments in favor of this setup. from what i’ve seen so far, solidqueue doesn’t make this separation.

[0]: https://riverqueue.com/

18 comments


runako 1 month ago

SolidQueue uses its own db configuration.

> it shouldn’t be the same as the production database

This is highly dependent on the application (scale, usage, phase of lifecycle, etc.)

bgentry 1 month ago
Yeah, River generally recommends this pattern as well (River co-author here :)
To get the benefits of transactional enqueueing you generally need to commit the jobs transactionally with other database changes. https://riverqueue.com/docs/transactional-enqueueing
It does not scale forever, and as you grow in throughput and job table size you will probably need to do some tuning to keep things running smoothly. But after the amount of time I've spent in my career tracking down those numerous distributed systems issues arising from a non-transactional queue, I've come to believe this model is the right starting point for the vast majority of applications. That's especially true given how high the performance ceiling is on newer / more modern job queues and hardware relative to where things were 10+ years ago.
If you are lucky enough to grow into the range of many thousands of jobs per second then you can start thinking about putting in all that extra work to build a robust multi-datastore queueing system, or even just move specific high-volume jobs into a dedicated system. Most apps will never hit this point, but if you do you'll have deferred a ton of complexity and pain until it's truly justified.
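
A minimal sketch of the transactional enqueueing idea described above, assuming the job table lives in the same database as the app data. It uses Python's built-in `sqlite3` purely for illustration; it is not River's API, and the `users`/`jobs` tables, `sign_up`, `count_rows`, and the `simulate_failure` flag are all hypothetical names.

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: the app table and the job table share ONE database.
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, kind TEXT NOT NULL, args TEXT NOT NULL)"
    )
    conn.commit()


def sign_up(conn: sqlite3.Connection, email: str, simulate_failure: bool = False) -> int:
    # "with conn" commits if the body succeeds and rolls back if it raises,
    # so the user row and the job row are written together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        user_id = cur.lastrowid
        conn.execute(
            "INSERT INTO jobs (kind, args) VALUES (?, ?)",
            ("welcome_email", json.dumps({"user_id": user_id})),
        )
        if simulate_failure:
            # Stand-in for any error that happens after enqueueing but before commit.
            raise RuntimeError("simulated failure before commit")
    return user_id


def count_rows(conn: sqlite3.Connection, table: str) -> int:
    # Table name is interpolated only because it comes from this sketch, not user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    sign_up(conn, "a@example.com")
    try:
        sign_up(conn, "b@example.com", simulate_failure=True)
    except RuntimeError:
        pass
    print(count_rows(conn, "users"), count_rows(conn, "jobs"))
```

With a separate queue store, the failed sign-up could leave a job behind for a user that was never committed (or a user with no job); here both writes roll back together.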
- yawboakye 1 month ago
  
  state machines to the rescue, ie i think the nature of asynchronous processing requires that we design for good/safe intermediate states.
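
One way to read the state-machine suggestion above, as a rough sketch: give each asynchronous task an explicit set of states and allow only known transitions, so every intermediate state is one the rest of the system can handle. The state names and the `transition` helper are hypothetical, not taken from any particular library.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    SENDING = "sending"
    SENT = "sent"
    FAILED = "failed"


# Hypothetical transition table: each state lists the states it may move to.
ALLOWED = {
    State.PENDING: {State.SENDING},
    State.SENDING: {State.SENT, State.FAILED},
    State.FAILED: {State.SENDING},  # a failed task may be retried
    State.SENT: set(),              # terminal: never send again
}


def transition(current: State, target: State) -> State:
    # Reject any move the table does not list, instead of silently accepting it.
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

A worker that crashes mid-task leaves the record in `SENDING`, which is a state a recovery process can look for and decide how to handle.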

andrewstuart 1 month ago

It’s not necessary to separate queue db from application db.

yawboakye 1 month ago
got it. is it necessary, then, to couple queue db with app db? if answer is no then we can’t make a necessity argument here, unfortunately.
- nick__m 1 month ago
  
  Frequently you have to couple the transactional state of the queue db and the app db, colocating them is the simplest way to achieve that without resorting to distributed transactions or patterns that involve orchestrated compensation actions.
  
- jrochkind1 1 month ago
  
  solid_queue by default prefers you use a different db than app db, and will generate that out of the box (also by default with sqlite3, which, separate discussion) but makes it possible, and fairly smooth, to configure to use the same db.
  Personally, I prefer the same db unless I were at a traffic scale where splitting them is necessary for load.
  One advantage of same db is you can use db transaction control over enqueuing jobs and app logic too, when they are dependent. But that's not the main advantage to me, I don't actually need that. I just prefer the simplicity, and as someone else said above, prefer not having to reconcile app db state with queue state if they are separate and only ONE goes down. Fewer moving parts are better in the apps I work on which are relatively small-scale, often "enterprise", etc.

erispoe 1 month ago

> it shouldn’t be the same as the production database

Why is that?

gregors 1 month ago
Here's an example from the circleci incident
https://status.circleci.com/incidents/hr0mm9xmm3x6
and a good analysis by a Flickr engineer who ran into similar issues
https://blog.mihasya.com/2015/07/19/thoughts-evoked-by-circl...
- davidw 1 month ago
  
  CircleCI and Flickr are both pretty big systems. There are tons of businesses that will never operate at that scale.
  
zarzavat 1 month ago
If you need to restore the production database do you also want to restore the task database?
If your task is to send an email, do you want to send it again? Probably not.
- stavros 1 month ago
  
  It's not like I'll get a choice between the task database going down and not going down. If my task database goes down, I'm either losing jobs or duplicating jobs, and I have to pick which one I want. Whether the downtime is at the same time as the production database or not is irrelevant.
  In fact, I'd rather it did happen at the same time as production, so I don't have to reconcile a bunch of data on top of the tasks.
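
For the "do you want to send it again?" question above, a common mitigation is to make the job idempotent. This is a rough sketch under stated assumptions: the `sent_emails` table, `create_dedupe_table`, `send_once`, and the key format are hypothetical, and it only helps if the record of sent keys itself survives whatever restore happens.

```python
import sqlite3
from typing import Callable


def create_dedupe_table(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per email that has already been sent.
    conn.execute("CREATE TABLE sent_emails (idempotency_key TEXT PRIMARY KEY)")
    conn.commit()


def send_once(conn: sqlite3.Connection, key: str, send: Callable[[], None]) -> bool:
    # Returns True if send() ran, False if this key was already recorded.
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO sent_emails (idempotency_key) VALUES (?)", (key,)
        )
        if cur.rowcount == 0:
            return False  # duplicate job: the key already exists, skip the send
        # If send() raises, the transaction rolls back and the key is not kept,
        # so the job can be retried later.
        send()
    return True
```

This does not remove the trade-off stavros describes: if `send()` succeeds but the commit is lost, a retry will still send a second email. It only narrows the window in which duplicates can happen.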
  