
Comment by rtpg

1 year ago

I am going to go the other direction on this... to anyone reading this, please consider using a backend-generic queueing system for your Python project.

Why? Mainly because those systems offer good affordances for testing and running locally in an operationally simple way. They also tend to have decent default answers for various futzy questions around disconnects at various parts of the workflow.

We all know Celery is a buggy pain in the butt, but rolling your own job queue likely ends up with you writing a similarly buggy pain in the butt. We've already done "Celery but simpler": that's stuff like Dramatiq!

If you have backend-specific needs, you won't listen to this advice. But think deeply about how important your needs really are. Computers are fast, and you can deal with a lot of events with most systems.

Meanwhile if you use a backend-generic system... well you could write a backend using PgQueuer!

In my experience, it's easy to test locally with PG: we have unit tests which re-create the DB for each test... It works.

Also, DB transactions are absolutely the best way to provide ACID guarantees.
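To make the ACID point concrete: if the queue lives in the same database as your application data, claiming a job and committing its side effects can happen in one transaction. Here's a minimal sketch using SQLite so it's self-contained; the table and column names are made up for illustration, and with Postgres you'd claim the row with `SELECT ... FOR UPDATE SKIP LOCKED` so concurrent workers don't grab the same job.

```python
import sqlite3

# Illustrative schema, not from any particular library.
conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit; we manage txns explicitly
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, done INTEGER DEFAULT 0)")
conn.execute("INSERT INTO jobs (payload) VALUES ('send-welcome-email')")

def work_one(conn):
    # Claim a job and do its work inside one transaction: a crash mid-job
    # rolls everything back, and the job remains queued for retry.
    conn.execute("BEGIN IMMEDIATE")  # Postgres: SELECT ... FOR UPDATE SKIP LOCKED here
    row = conn.execute("SELECT id, payload FROM jobs WHERE done = 0 LIMIT 1").fetchone()
    if row is None:
        conn.execute("COMMIT")
        return None
    job_id, payload = row
    # ... do the actual work here; any rows it writes share this transaction ...
    conn.execute("UPDATE jobs SET done = 1 WHERE id = ?", (job_id,))
    conn.execute("COMMIT")  # the work and the job's completion become visible atomically
    return payload

print(work_one(conn))  # → send-welcome-email
```

The atomicity is the whole appeal: with a separate broker, "job done" and "data written" are two systems that can disagree after a crash; here they can't.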

> those systems offer good affordances for testing and running locally in an operationally simple way

Define "operationally simple": most if not all of them need a persistence layer anyway, on top of the queue itself. This approach eliminates the queue and uses a persistent store you likely already have.

  • Well for example, lots of queueing libraries have an "eager task" runtime option. What does that do? Instead of putting work into a backend queue, it just immediately runs the task in-process. You don't need any processing queue!

    How many times have you shipped some background task change, only to realize half your test suite doesn't do anything with background tasks, and you're not testing your business logic to its logical conclusion? Eager task execution catches bugs earlier, and is close enough to reality for the things that matter, while removing the need for, say, multi-process coordination in most tests.

    And you can still test things the "real way" if you need to!

    And to your other point: you can use Dramatiq with Postgres, for example[0]. I've written custom backends that just use pg for these libs, it's usually straightforward because the broker classes tend to abstract the gnarly things.

    [0]: https://pypi.org/project/dramatiq-pg/

  • Some message queue brokers that traditionally implement their own backends can also use PostgreSQL (and other RDBMSs) for persistence. This is a reasonable option if you (a) want to consolidate persistence backends or (b) want a mature, battle-proven broker and client stack.

Some names:

- Celery (massive and heavy)

- Dramatiq

- APScheduler

- Huey

Today, Redis queues, unless you're strictly single-process, seem to be the most pain-free for small-scale use.

  • We had a terrible time with Dramatiq; it was very buggy and resource-heavy. We ended up switching to an SNS/SQS combo.

    • Ah that's unfortunate, I had a pretty OK time with Dramatiq (coming off of Celery especially). But I imagine this is dependent on your scale.

      I think most people reading this site are working on relatively small systems (that still need background tasks!) where the fixed costs of background tasks can be reasonable. But I could be totally off base.

    • For small-scale use cases, being resource-heavy does not matter; only ease of use does. And Amazon's proprietary APIs are notoriously hard to work with.