Comment by ilitirit
1 day ago
> The structured data gets written to our Postgres database by tens of thousands of simultaneous writers. Each of these writers is a “meeting bot”, which joins a video call and captures the data in real-time.
Maybe I missed it in some folded-up embedded content or a graph (or maybe I'm just blind...), but is it mentioned at which point they started running into issues? The quoted bit about tens of thousands of simultaneous writers is all I can find.
What is the qualitative and quantitative nature of the relevant workloads? Depending on the answers, some people may not care.
I asked ChatGPT to research it and this is the executive summary:
For PostgreSQL’s LISTEN/NOTIFY, a realistic safe throughput is:
- Up to ~100–500 notifications/sec: handles well on most systems with minimal tuning; low risk of contention.
- ~500–2,000 notifications/sec: reasonable with good tuning (short transactions, fast listeners, few concurrent writers); may start to see lock contention.
- ~2,000–5,000 notifications/sec: pushing the upper bounds; requires careful batching, dedicated listeners, possibly separate Postgres instances for pub/sub.
- >5,000 notifications/sec: not recommended for sustained load; you'll likely hit serialization bottlenecks due to the global commit lock held during NOTIFY.
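For context, the pattern these numbers describe is essentially this (a minimal psycopg2 sketch; the channel name and connection string are placeholders I made up, not anything from the article):

    import select
    import psycopg2
    import psycopg2.extensions

    # Placeholder connection string, not from the article under discussion.
    conn = psycopg2.connect("dbname=test")
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

    cur = conn.cursor()
    cur.execute("LISTEN events;")

    while True:
        # Wait up to 5s for the socket to become readable, then drain
        # whatever notifications Postgres has delivered.
        if select.select([conn], [], [], 5) == ([], [], []):
            continue
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            print(f"channel={note.channel} payload={note.payload}")

A writer then fires SELECT pg_notify('events', 'some payload'); the notification is only delivered when the writer's transaction commits, which is where the serialization in question comes from.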
[flagged]
What is wrong with you? Why would you even bother posting a comment like this?
Maybe you also don't know what ChatGPT Research is (the Enterprise version, if you really need to know), or what an executive summary implies, but here's a snippet of the 28 sources it used:
https://imgur.com/a/eMdkjAh
In that snippet are links to Postgres docs and two blog posts, one being the blog post under discussion. None of those contain the information needed to make the presented claims about throughput.
To make those claims you need to know what work is done while the lock is held. That includes various resource cleanup, which should be cheap, and RecordTransactionCommit(), which grabs a lock to insert a WAL record, waits for it to be flushed to disk, and potentially also waits for it to be acknowledged by a synchronous replica. So the expected throughput is somewhere between hundreds and tens of thousands of notifies per second. But as far as I can tell, that conclusion is only available from the PostgreSQL source code plus some assumptions about typical storage and network performance.
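If you want a number for your own setup, a trivial micro-benchmark tells you more than any general claim. A sketch (psycopg2 again; the connection string and channel name are made up for illustration):

    import time
    import psycopg2

    # Placeholder connection string and channel name.
    conn = psycopg2.connect("dbname=test")
    cur = conn.cursor()

    N = 1000
    start = time.perf_counter()
    for i in range(N):
        # pg_notify() queues a notification that is sent at commit time.
        cur.execute("SELECT pg_notify('bench', %s);", (str(i),))
        # The commit is where the cost lives: WAL insert, flush to disk,
        # possibly a synchronous-replica ack, all while notifying
        # committers are serialized against each other.
        conn.commit()
    elapsed = time.perf_counter() - start
    print(f"{N / elapsed:.0f} NOTIFY commits/sec on this setup")

Run that from several concurrent connections and the serialization becomes visible too: aggregate throughput stops scaling once the commit path is the bottleneck.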