Comment by paffdragon

1 year ago

Indeed, that's my experience too. We used partitions, as others mentioned below, but Postgres had issues moving rows across tables atomically, so we had to write our own complex queries to work around it. Job expiration was also dynamic, so we needed background cleanup. The bigger problem was the planner not picking up sudden changes in volume, so we had to run ANALYZE on the table from a cron job. On top of that came managing retries with backoff, etc. At some point we stopped fighting it and just moved to SQS. We've had zero problems since, no maintenance is needed, and it's still free, so we saved on storage cost, time, and developer effort for ongoing maintenance.
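To give a flavor of the retry bookkeeping mentioned above, here is a minimal sketch of a capped exponential backoff. This is not our actual code; the function name `backoff_delay` and the `base` and `cap` defaults are made up for illustration.

```python
def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Return the delay in seconds before retry number `attempt` (0-based).

    Hypothetical helper: the delay doubles on each attempt and never exceeds `cap`.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent so a very large attempt count cannot overflow a float.
    exponent = min(attempt, 32)
    return min(cap, base * (2 ** exponent))
```

In a Postgres-backed queue you would typically store something like "now plus this delay" on the job row and have workers skip rows whose retry time is still in the future. That is exactly the kind of plumbing a managed queue takes off your hands.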

We still use Postgres for simple queues, but those don't really need a library since they're usually quite simple, and with some advisory locks we can handle unlocking crashed jobs fairly well too.
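For the advisory lock part, a rough sketch of the idea, assuming a DB-API cursor with `%s` placeholders (as in psycopg) and integer job ids. The helper names `claim_job` and `release_job` are hypothetical. It relies on `pg_try_advisory_lock` taking a session-level lock, which Postgres releases on its own when the session ends, so a crashed worker's job becomes claimable again once its connection is gone.

```python
from typing import Any, Iterable, Optional


def claim_job(cur: Any, candidate_ids: Iterable[int]) -> Optional[int]:
    """Try to lock one of the candidate job ids; return it, or None if all are taken.

    Each id is locked in its own statement, so no extra rows get locked
    as a side effect of a larger query.
    """
    for job_id in candidate_ids:
        # Non-blocking: returns true if we got the lock, false if another session holds it.
        cur.execute("SELECT pg_try_advisory_lock(%s)", (job_id,))
        (locked,) = cur.fetchone()
        if locked:
            # The caller should re-check that the job is still pending before running it.
            return job_id
    return None


def release_job(cur: Any, job_id: int) -> None:
    """Release the session-level lock after the job finishes normally."""
    cur.execute("SELECT pg_advisory_unlock(%s)", (job_id,))
```

The worker has to keep the same connection open while it runs the job, because the lock belongs to that session. With a connection pooler in transaction mode this would not behave as expected.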