Comment by JoelJacobson

3 months ago

> The problem with Postgres' NOTIFY is that all notifications go through a single queue!

> Even if you have 20 database connections making 20 transactions in parallel, all of them need to wait for their turn to lock the notification queue, add their notification, and unlock the queue again. This creates a bottleneck especially in high-throughput databases.

We're currently working hard on optimizing LISTEN/NOTIFY: https://www.postgresql.org/message-id/flat/6899c044-4a82-49b...

If you have any experiences of actual workload where you are currently experiencing performance/scalability problems, I would be interested in hearing from you, to better understand the actual workload. In some workloads, you might only listen to a single channel. For such single-channel workloads, the current implementation seems hard to tweak further, given the semantics and in-commit-order guarantees. However, for multi-channel workloads, we could do a lot better, which is what the linked patch is about. The main problem with the current implementation for multi-channel workloads, is that we currently signal and wake all listening backends (a backend is the PostgreSQL processes your client is connected to), even if they are not interested in the specific channels being notified in the current commit. This means that if you have 100 connections open in which each connect client has made a LISTEN on a different channel, then when someone does a NOTIFY on one of those channels, instead of just signaling the backend that listen on that channel, all 100 backends will be signaled. For multi-channel workloads, this could mean an enormous extra cost coming from the context-switching due to the signaling.

I would greatly appreciate if you could please reply to this comment and share your different workloads when you've had problems with LISTEN/NOTIFY, to better understand approximately how many listening backends you had, and how many channels you had, and the mix of volume on such channels. Anything that could help us do better realistic simulations of such workloads, to improve the benchmark tests we're working on. Thank you.

9 comments

JoelJacobson

JoelJacobson 3 months ago

Here is the Commitfest entry if you want to help with reviewing/development/testing of the patch: https://commitfest.postgresql.org/patch/6078/

tobyhinloopen 3 months ago

We use it like this:

    CREATE TRIGGER notify_events_trg AFTER INSERT ON xxx.events FOR EACH ROW EXECUTE PROCEDURE public.notify_events();

    CREATE FUNCTION public.notify_events() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
    BEGIN
      PERFORM pg_notify('events', row_to_json(NEW)::text);
      RETURN NEW;
    END;
    $$;

And then we have a bunch of triggers like this on many tables:

    CREATE TRIGGER create_category_event_trg AFTER INSERT OR DELETE OR UPDATE ON public.categories FOR EACH ROW EXECUTE PROCEDURE public.create_category_event();

    CREATE FUNCTION public.create_category_event() RETURNS trigger
        LANGUAGE plpgsql SECURITY DEFINER
        AS $$
    DECLARE
      category RECORD;
      payload JSONB;
    BEGIN
      category := COALESCE(NEW, OLD);
      payload := jsonb_build_object('id', category.id);
      IF NEW IS NULL OR NEW.deleted_at IS NOT NULL THEN
        payload := jsonb_set(payload, '{deleted}', 'true');
      END IF;
      INSERT INTO xxx.events (channel, inserted_at, payload)
        VALUES ('category', NOW() AT TIME ZONE 'utc', payload);
      RETURN NULL;
    END;
    $$;

We found no notable performance issues. We have a single LISTEN in another application. We did some stress testing and found that it performs way better than we would ever need

JoelJacobson 3 months ago
Thanks for the report. For that use-case (if you have a single application using a single connection with a LISTEN) then it's expected that is should perform well, since then there is only a single backend which will be context-switched to when each NOTIFY signals it.
- oulipo2 3 months ago
  
  Just out of curiosity, could you try to frame in what context this would or would not work? If you have multiple backends with multiple connections for instance? And then if we start with such a "simple" solution and we later need to scale with distributed backends, how should we do this?
  
  2 replies →

rickstanley 3 months ago

We tried to use LISTEN/NOTIFY for notification purposes, or rather, a queue in which the order mattered (well, kind of, we were aiming for the best scenario where a customer would not perform a specific task in a matter of milliseconds, because we could afford this expectation), in a .NET application using Npgsql.

The listeners were scattered between replicas, so we took advantage of Advisory Locks (https://news.ycombinator.com/item?id=44490510, and my first point felt validated;

but to this day I'm still unsure. Anyway, since it was a complementary system, it didn't hurt to leave it out, we had another background job that would process the outbox table regardless, but I felt it could/would give something closer to "real time" in our system.

oulipo2 3 months ago

The post seems to say that NOTIFY is generally not a good idea, then comments here say that NOTIFY can actually work, but it depends on some particular things (which are not easy to know for newcomers to Postgres), makes it a bit complicated to know what is the way to go for a new database

In my case I have an IoT setting, where my devices can change their "DesiredState", and I want to listen on this to push some message to MQTT... but then there might be also other cases where I want to listen to some messages elsewhere (eg do something when there is an alert on a device, or listen to some unrelated object, eg users, etc)

I'm not clear right now what would be the best setting to do this, the tradeoffs, etc

Imagine I have eg 100k to 10M range of devices, that sometimes these are updated in bulks and change their DesiredState 10k at a time, would NOTIFY work in that case? Should I use the WAL/Debezium/etc?

Can you try to "dumb down" in which cases we can use NOTIFY/LISTEN and in which case it's best not to? you're saying something about single-channel/multi-channel/etc but to a newcomer I'm not clear on what all these are

barrell 3 months ago

I have listen/notify on most changes in my database. Not sure I've experienced any performance issue though but I can't say I've been putting things through their paces. IMHO listen/notify's simplicity outweighed the perf gains by WAL.

I'm only sharing this should it be helpful:

  def up do
    whitelist = Enum.join(@user_columns ++ ["tick"], "', '")

    execute """
    CREATE OR REPLACE FUNCTION notify_phrasing() RETURNS trigger AS $$
    DECLARE
      notif jsonb;
      col_name text;
      col_value text;
      uuids jsonb := '{}'::jsonb;
      user_columns text[] := ARRAY['#{whitelist}'];
    BEGIN
      -- First, add all UUID columns
      FOR col_name IN
        SELECT column_name
        FROM information_schema.columns
        WHERE table_name = TG_TABLE_NAME AND data_type = 'uuid'
      LOOP
        EXECUTE format('SELECT ($1).%I::text', col_name)
        INTO col_value
        USING CASE WHEN TG_OP = 'DELETE' THEN OLD ELSE NEW END;

        IF col_value IS NOT NULL THEN
          uuids := uuids || jsonb_build_object(col_name, col_value);
        END IF;
      END LOOP;

      -- Then, add user columns if they exist in the table
      FOREACH col_name IN ARRAY user_columns
      LOOP
        IF EXISTS (
          SELECT 1
          FROM information_schema.columns
          WHERE table_name = TG_TABLE_NAME AND column_name = col_name
        ) THEN
          EXECUTE format('SELECT ($1).%I::text', col_name)
          INTO col_value
          USING CASE WHEN TG_OP = 'DELETE' THEN OLD ELSE NEW END;

          IF col_value IS NOT NULL THEN
            uuids := uuids || jsonb_build_object(col_name, col_value);
          END IF;
        END IF;
      END LOOP;

      notif = jsonb_build_object(
        'table', TG_TABLE_NAME,
        'event', TG_OP,
        'uuids', uuids
      );

      PERFORM pg_notify('phrasing', notif::text);
      RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;
    """

    # Create trigger for each table
    Enum.each(@tables, fn table ->
      execute """
      CREATE TRIGGER notify_phrasing__#{table}
      AFTER INSERT OR UPDATE OR DELETE ON #{table}
      FOR EACH ROW EXECUTE FUNCTION notify_phrasing();
      """
    end)
  end

I do react to most (70%) of my database changes in some way shape or form, and post them to a PubSub topic with the uuids. All of my dispatching can be done off of uuids.