Comment by redskyluan
1 year ago
There seems to be a lot of hype around adapting PG into every piece of infra. I love PG, but this doesn't seem to be the right thing.
I think for me the problem with every single new PG queue is that it seems like everyone and their mother thinks they need to reinvent this specific wheel for some reason and the flavor of the day doesn’t often bring much new to the space. Probably because it's
1. Pretty easy to understand and grok the problem space
2. Scratching the programmer itch of wanting something super generic that you can reuse all over the place
3. Doable with a modest effort over a reasonable scope of time
4. Built on rock solid internals (Postgres) with specific guarantees that you can lean on
Here's 7 of them just right quick:
- https://github.com/timgit/pg-boss
- https://github.com/queueclassic/queue_classic
- https://github.com/florentx/pgqueue
- https://github.com/mbreit/pg_jobs
- https://github.com/graphile/worker
- https://github.com/pgq/pgq
- https://github.com/que-rb/que
Probably could easily find more by searching, I only spent about 5 minutes looking and grabbing the first ones I found.
I'm all for doing this kind of thing as an academic exercise, because it's a great way to learn about this problem space. But at this point if you're reinventing the Postgres job queue wheel and sharing it to this technical audience you need to probably also include why your wheel is particularly interesting if you want to grab my attention.
At low-medium scale, this will be fine. Even at higher scale, so long as you monitor autovacuum performance on the queue table.
At some point it may become practical to bring a dedicated queue system into the stack, sure, but this can massively simplify things when you don’t need or want the additional complexity.
Aside from that, the main advantage of this is transactions. I can insert a row and queue the job for its Elasticsearch update in the same transaction, and it's guaranteed that both the row and the job are inserted.
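A minimal sketch of that pattern, using Python's built-in sqlite3 purely as a stand-in so it runs without a server (with Postgres the shape is the same: both INSERTs inside one BEGIN/COMMIT). The `articles` and `jobs` tables and the `create_article` helper are hypothetical names for illustration, not from any of the libraries listed above.

```python
import json
import sqlite3


def setup_tables(conn):
    # Hypothetical schema: one application table and one queue table.
    conn.executescript("""
        CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE jobs (
            id INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,
            payload TEXT NOT NULL,
            done INTEGER NOT NULL DEFAULT 0
        );
    """)


def create_article(conn, title, fail=False):
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the row and the job are inserted together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO articles (title) VALUES (?)", (title,))
        article_id = cur.lastrowid
        if fail:
            # Simulate a crash between the two inserts.
            raise RuntimeError("simulated failure before the job is queued")
        conn.execute(
            "INSERT INTO jobs (kind, payload) VALUES (?, ?)",
            ("elasticsearch_update", json.dumps({"article_id": article_id})),
        )
    return article_id
```

If the simulated failure fires, the article row is rolled back as well, so there is never a row without its job or a job without its row.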
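If you use a dedicated queue system then this becomes a lot more tricky: the database commit and the queue publish are two separate operations, so one can succeed while the other fails.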
There are of course also situations where this doesn't apply, but this "insert row(s) in SQL and then queue job to do more with that" is a fairly common use case for queues, and in those cases this is a great choice.
Transactional Outbox solves this. You use a table like in the first example but instead of actually doing the ElasticSearch update the Outbox table is piped into the dedicated queue.
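As a rough sketch of the relay half of that idea: a loop reads unsent rows from the outbox table, publishes each one to the dedicated queue, and only then marks it sent. Everything here is an assumption for illustration: the `outbox` table, the `relay_once` helper, sqlite3 standing in for Postgres, and `publish` being whatever callable pushes to the real queue.

```python
import queue
import sqlite3


def setup_outbox(conn):
    # Hypothetical outbox table; rows are written in the same transaction
    # as the business data, then drained by the relay below.
    conn.executescript("""
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY,
            payload TEXT NOT NULL,
            sent INTEGER NOT NULL DEFAULT 0
        );
    """)


def relay_once(conn, publish, batch_size=100):
    # Fetch a batch of unsent rows in insertion order.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for row_id, payload in rows:
        # Publish first, mark second: if publish raises, the row stays
        # unsent and is retried on the next pass (at-least-once delivery).
        publish(payload)
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because a crash between the publish and the update re-sends the row on the next pass, consumers of the dedicated queue need to tolerate duplicates.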
Most of these two phase problems can be solved by having separate queue consumers.
And as far as I can tell, this is only a perk when your two actions are mutate the collocated database and do X. For all other situations this seems like a downgrade.
I agree, there is no need for FANG-level infrastructure. IMO, in most cases, the simplicity / performance tradeoff for small/medium is worth it. There is also statistics tooling that helps you monitor throughput and failure rates (aggregated on a per-second basis).
Instead of SQS, I recently created a basic abstraction on PG that mimics the SQS apis. The intention was to use it during development and we would simply switch to SQS later.
Never did. The production code still uses the PG-based queue (which has been improved since) and PG just works perfectly fine. Might still need to go with a dedicated queue service at some point but it has been perfectly fine so far.
I use it as a job queue. Yes, it has its cons, but not dealing with another moving piece in the big picture is totally worth it.
I mean, I love Postgres like the next guy. And I like simple solutions as long as they work. I just wonder if this is truly simpler than using a Redis or RabbitMQ queue if you need queues. If you're already using a cloud provider, SQS is quite trivial as well.
I guess if you already have Postgres and don't want to use the cloud provider's solution, you can use this to avoid hosting another piece of infra.
DB-based gives you the ability to query against your queues, if your use case needs it. Other options tend to dispose of the state once the job is finished.