Comment by samwillis

1 year ago

This looks like a great task queue. I'm a massive proponent of "Postgres is all you need" [0], and I'm doubling down on it with my own project, which takes it to the extreme.

What I would love is a Postgres task queue that does multi-step pipelines, with fan-out and accumulation. In my view a structured relational database is a particularly good backend for that, as it can inherently model the pipeline structure. Is that something you have considered exploring?

The one thing I find lacking with LISTEN/NOTIFY is the maximum payload size of 8k; it somewhat limits its usefulness without having to start saving stuff to tables. What I would really like is a streaming table, with a schema and all the rich type support... maybe one day.

0: https://www.amazingcto.com/postgres-for-everything/

Putting low-throughput queues in the same DB is great both for simplicity and for getting exactly-once processing.

Putting high-throughput queues in Postgres sucks because...

No O(1) guarantee to get the latest job; the query planner can go haywire.

High-update tables bloat like crazy; fixing that needs a whole new storage engine, a.k.a. zheap.

Write amplification, as every update has to update every index.

LISTEN/NOTIFY doesn't work through connection pooling
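
To make the first point concrete, the usual Postgres dequeue pattern looks roughly like this (table and column names are illustrative, not from any particular library):

  -- Claim the oldest queued job; SKIP LOCKED lets concurrent workers
  -- skip rows another worker has already locked.
  UPDATE jobs
     SET status = 'running', started_at = now()
   WHERE id = (
           SELECT id
             FROM jobs
            WHERE status = 'queued'
            ORDER BY created_at
            LIMIT 1
            FOR UPDATE SKIP LOCKED
         )
  RETURNING *;

Whether that inner SELECT stays a cheap index scan is entirely up to the planner.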

  • Update-related throughput and index problems are only a problem if you update tables. You can use an append-only structure to mitigate some of that: insert new entries with the updated statuses instead. You also gain the benefit of history. You can even coax the index into holding non-key values for speed with the INCLUDE clause of CREATE INDEX (rough sketch below).

    You can then delete the older rows as required.

    Query planner issues are a general problem in Postgres and are not unique to queues. I'm not sure what O(1) means in this context; I'm not sure pg has ever been able to promise constant-time access to anything, and indeed, with an index, access would never be asymptotically upper-bounded by a constant at all.
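
    Roughly what I mean, with made-up table and column names (a sketch, not anyone's actual schema):

      -- Append-only status log: one row per status transition, never UPDATEd.
      CREATE TABLE job_events (
          id         bigserial   PRIMARY KEY,
          job_id     bigint      NOT NULL,
          status     text        NOT NULL,  -- e.g. 'queued', 'running', 'done'
          payload    jsonb,
          created_at timestamptz NOT NULL DEFAULT now()
      );

      -- INCLUDE stores non-key columns in the index so status lookups can be
      -- index-only scans without touching the heap (Postgres 11+).
      CREATE INDEX job_events_job_idx
          ON job_events (job_id, created_at DESC)
          INCLUDE (status);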

    • By the time you need append-only job statuses it's better to move to a dedicated queue. Append-only statuses help, but they also make the polling query a lot more expensive (see the sketch below).

      Deleting older rows is a nightmare at scale. It leaves holes in the earlier parts of the table and nerfs half the advantage of using append-only in the first place. You end up paying 8 kB page I/O costs for a single job.

      Dedicated queues have constant-time enqueue and dequeue operations that don't blow up at random times.
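
      For illustration, polling an append-only log (using the kind of schema sketched above; names are made up) means computing each job's latest status first:

        -- Find a job whose most recent event is still 'queued'.
        SELECT job_id
          FROM (
                 SELECT DISTINCT ON (job_id) job_id, status
                   FROM job_events
                  ORDER BY job_id, created_at DESC
               ) latest
         WHERE status = 'queued'
         LIMIT 1;

      That "latest row per job" step is considerably more work than a single indexed lookup on a mutable status column.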

  • Indeed, that's my experience too. We used partitions like others mentioned below, but Postgres had issues with moving rows across tables atomically, so we had to implement our own complex queries to work around it. Job expiration was also dynamic, which meant background cleaning. The bigger problem was that the planner couldn't pick up sudden changes in volume, so we had to run ANALYZE from a cron job. Then there was managing retries with backoffs, and so on. At some point we stopped fighting it and just moved to SQS; we've had zero problems since and no maintenance is needed, and it's still free, so we saved storage cost, time, and developer effort on ongoing maintenance.

    We still use Postgres for simple queues, but those don't really require a library since they're usually quite simple; with some advisory locks we can handle unlocking crashed jobs fairly well too (rough sketch below).
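
    Roughly the pattern, as an illustration of session-level advisory locks with a made-up jobs table (not our actual code; :job_id is a placeholder):

      -- Claim the job by taking a session-level advisory lock on its id.
      -- If the worker crashes, its connection drops and the lock is released
      -- automatically, so the job becomes claimable again.
      SELECT pg_try_advisory_lock(:job_id);  -- true if we got the lock

      -- ... do the work ...

      -- When finished, mark the job done and release the lock.
      UPDATE jobs SET status = 'done' WHERE id = :job_id;
      SELECT pg_advisory_unlock(:job_id);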

  • > LISTEN/NOTIFY doesn't work through connection pooling

    What's the problem with using it with connection pooling?

Thanks for your insights!

Regarding multi-step pipelines and fan-out capabilities: It's a great suggestion, and while PgQueuer doesn't currently support this, it's something I'm considering for future updates.

As for the LISTEN/NOTIFY payload limit, PgQueuer uses these signals just to indicate changes in the queue table, avoiding the size constraint by not transmitting substantial data through this channel.
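
The general shape is roughly this (a simplified illustration rather than the exact implementation; the channel and table names are made up):

  -- A trigger on the queue table sends a contentless notification...
  CREATE OR REPLACE FUNCTION notify_queue_change() RETURNS trigger AS $$
  BEGIN
      PERFORM pg_notify('queue_changed', '');  -- no payload, just a wake-up
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER queue_change_notify
      AFTER INSERT OR UPDATE ON jobs
      FOR EACH ROW EXECUTE FUNCTION notify_queue_change();

  -- ...and workers that LISTEN on the channel wake up and query the table
  -- for the actual job data, so the 8k NOTIFY limit never comes into play.
  LISTEN queue_changed;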

I've come to hate "Postgres is all you need." Or at least, "a single Postgres database is all you need."

  • I won't call it "hate", but I've run into quite a few situations where the Postgres-based version caused a lot of pain.

    - When it wasn't as easy as a dedicated solution: where installing and managing a focused service is overall easier than shoehorning it into PG.

    - When it didn't perform anywhere close to a dedicated solution: overhead from the guarantees that PG makes (ACID and all that) when you don't need them, or where the relational architecture isn't suited to the type of data, e.g. hierarchical, time-series, etc.

    - When it's not as feature-complete as a dedicated service: for example, I'm quite sure one can build (parts of) an Active Directory or a Kafka bus entirely in PG, but it will lack features that you'll likely need in future; they are built into these dedicated solutions because they are often needed, after all.

  • Why?

    • Just to share an anecdote, we've been able to get to market faster than ever before by just using Postgres (Supabase) for basically everything. We're leveraging RLS, and it's saved us from building an API (just using PostgREST). We've knocked months off our project timeline by doing this.

Is multi-step (fan-out, etc.) typically something a queue or message bus would handle?

I've always handled this with an orchestrator solution (think Airflow and similar).

Or is this a matter of use case? For a real-time scenario where you need a series of things to happen (user registration, etc.), maybe a queue handling this makes sense, whereas for longer-running tasks (ETL pipelines, etc.) the orchestrator is beneficial?