Comment by JoelJacobson
3 months ago
> The problem with Postgres' NOTIFY is that all notifications go through a single queue!
> Even if you have 20 database connections making 20 transactions in parallel, all of them need to wait for their turn to lock the notification queue, add their notification, and unlock the queue again. This creates a bottleneck especially in high-throughput databases.
We're currently working hard on optimizing LISTEN/NOTIFY: https://www.postgresql.org/message-id/flat/6899c044-4a82-49b...
If you have experience with an actual workload where you are currently seeing performance/scalability problems, I would be interested in hearing from you, to better understand that workload. In some workloads, you might only listen to a single channel. For such single-channel workloads, the current implementation seems hard to tweak further, given the semantics and in-commit-order guarantees. However, for multi-channel workloads we could do a lot better, which is what the linked patch is about. The main problem with the current implementation for multi-channel workloads is that we currently signal and wake all listening backends (a backend is the PostgreSQL process your client is connected to), even if they are not interested in the specific channels being notified in the current commit. This means that if you have 100 open connections, each of which has done a LISTEN on a different channel, then when someone does a NOTIFY on one of those channels, instead of signaling just the backend that listens on that channel, all 100 backends will be signaled. For multi-channel workloads, this can mean an enormous extra cost from the context switching caused by the signaling.
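To make the scenario concrete, here is a minimal sketch (the channel names are illustrative, not from the patch or the thread):

```sql
-- 100 sessions, each listening on its own channel:
-- session 1:   LISTEN chan_1;
-- session 2:   LISTEN chan_2;
-- ...
-- session 100: LISTEN chan_100;

-- One writer commits a notification on a single channel:
NOTIFY chan_42, 'payload';

-- With the current implementation, all 100 listening backends are
-- signaled and woken at commit, even though only the backend that ran
-- LISTEN chan_42 actually receives the notification. The linked patch
-- aims to signal only the interested backend(s).
```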
I would greatly appreciate it if you could reply to this comment and share the workloads where you've had problems with LISTEN/NOTIFY: approximately how many listening backends you had, how many channels, and the mix of volume across those channels. Anything that helps us run realistic simulations of such workloads will improve the benchmark tests we're working on. Thank you.
Here is the Commitfest entry if you want to help with reviewing/development/testing of the patch: https://commitfest.postgresql.org/patch/6078/
We use it like this:
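(The original snippet wasn't captured in this thread; a minimal sketch of what such usage might look like, with a made-up channel name and payload:)

```sql
-- Hypothetical sketch: the application opens one connection and
-- listens on a single channel ("table_changes" is an assumption).
LISTEN table_changes;

-- Writers can then notify with a small payload, either directly:
NOTIFY table_changes, '{"table": "orders", "id": 42}';
-- or from SQL/trigger code via the function form:
SELECT pg_notify('table_changes', '{"table": "orders", "id": 42}');
```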
And then we have a bunch of triggers like this on many tables:
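(The trigger code itself wasn't captured either; a hedged sketch of what such a trigger could look like — the function, channel, and column names are assumptions, not the commenter's actual code:)

```sql
-- Generic trigger function that notifies with the table name, the
-- operation, and the row id. Assumes each table has an "id" column;
-- the payload is kept small because NOTIFY payloads are limited to
-- 8000 bytes by default.
CREATE OR REPLACE FUNCTION notify_change() RETURNS trigger AS $$
BEGIN
  PERFORM pg_notify(
    'table_changes',
    json_build_object('table', TG_TABLE_NAME,
                      'op',    TG_OP,
                      'id',    COALESCE(NEW.id, OLD.id))::text
  );
  RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

-- Attached to many tables, e.g.:
CREATE TRIGGER orders_notify
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION notify_change();
```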
We found no notable performance issues. We have a single LISTEN in another application. We did some stress testing and found that it performs far better than we will ever need.
Thanks for the report. For that use-case (a single application using a single connection with a LISTEN), it's expected that it should perform well, since there is only a single backend to context-switch to when each NOTIFY signals it.
Just out of curiosity, could you try to frame in what contexts this would or would not work? For instance, if you have multiple backends with multiple connections? And if we start with such a "simple" solution and later need to scale to distributed backends, how should we do that?
We tried to use LISTEN/NOTIFY for notification purposes, or rather as a queue in which the order mattered (well, kind of: we were aiming for the best case, where a customer wouldn't have to wait more than a matter of milliseconds for a specific task, because we could afford that expectation), in a .NET application using Npgsql.
The listeners were scattered across replicas, so we took advantage of advisory locks (https://news.ycombinator.com/item?id=44490510), and my first point felt validated;
but to this day I'm still unsure. Anyway, since it was a complementary system, it didn't hurt to leave it out: we had another background job that would process the outbox table regardless, but I felt LISTEN/NOTIFY could give us something closer to "real time".
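Electing a single listener among replicas with advisory locks can be sketched like this (the lock key and channel name are arbitrary assumptions, not the commenter's actual setup):

```sql
-- Each replica tries to take a session-level advisory lock at startup;
-- exactly one session gets it and becomes the active listener.
SELECT pg_try_advisory_lock(812931);  -- returns true for one session only

-- The winner issues the LISTEN:
LISTEN outbox_events;

-- Losers stand by (e.g. retry the lock periodically), falling back to
-- the background job that processes the outbox table regardless.
```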
The post seems to say that NOTIFY is generally not a good idea, but comments here say that NOTIFY can actually work, depending on particulars that are not easy for newcomers to Postgres to know. That makes it a bit complicated to know what the way to go is for a new database.
In my case I have an IoT setting where my devices can change their "DesiredState", and I want to listen for this to push a message to MQTT... but there might also be other cases where I want to listen to messages elsewhere (eg do something when there is an alert on a device, or listen to some unrelated object, eg users, etc)
I'm not clear right now on what the best setup for this would be, the tradeoffs, etc
Imagine I have eg 100k to 10M devices, and sometimes these are updated in bulk, changing their DesiredState 10k at a time: would NOTIFY work in that case? Should I use the WAL/Debezium/etc instead?
Can you try to "dumb down" in which cases we can use NOTIFY/LISTEN and in which cases it's best not to? You mention single-channel vs multi-channel workloads, but as a newcomer I'm not clear on what these terms mean
I have listen/notify on most changes in my database. I'm not sure I've experienced any performance issues, though I can't say I've really put it through its paces. IMHO listen/notify's simplicity outweighed the perf gains of going through the WAL.
I'm only sharing this in case it's helpful:
I react to most (~70%) of my database changes in some way, shape, or form, and post them to a PubSub topic with the UUIDs. All of my dispatching can be done off of UUIDs.
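A hedged sketch of a uuid-only payload along those lines (the function, channel, and column names are assumptions, not the commenter's actual code):

```sql
-- Trigger function that publishes only the row's uuid; the consumer
-- that receives it re-reads the row, or dispatches to the PubSub
-- topic keyed on that uuid, so the NOTIFY payload stays tiny.
CREATE OR REPLACE FUNCTION publish_uuid() RETURNS trigger AS $$
BEGIN
  PERFORM pg_notify('changes', (COALESCE(NEW.uuid, OLD.uuid))::text);
  RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;
```

Carrying only the uuid sidesteps the payload size limit and lets the consumer fetch the authoritative row state at read time.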