← Back to context

Comment by singron

19 hours ago

PG does reuse plans, but only if you prepare a query and run it more than 5 times on that connection. See plan_cache_mode[0] and the PREPARE docs it links to. This works great on simple queries that run all the time.

It sometimes really stinks on some queries since the generic plan can't "see" the parameter values anymore. E.g. if you have an index on (customer_id, item_id) and run a query where `customer_id = $1 AND item_id = ANY($2)` ($2 is an array parameter), the generic query plan doesn't know how many elements are in the array and can decide to do an elaborate plan like a bitmap index scan instead of a nested loop join. I've seen the generic plan flip-flop in a situation like this and have a >100x load difference.

The plan cache is also per-connection, so you still have to plan a query multiple times. This is another reason why consolidating connections in PG is important.

0: https://www.postgresql.org/docs/current/runtime-config-query...

Yes manual query preparation by client [1] is what you did in MSSQL server up until v7.0 I believe, which was 1998 when it started doing automatic caching based on statement text. I believe it also cached stored procedures before v7.0 which is one reason they were recommended for all application code access to the database back then.

MSSQL server also does parameter sniffing now days and can have multiple plans based on the parameters values it also has a hint to guide or disable sniffing because many times a generic plan is actually better, again something else PG doesn't have, HINTS [2].

PG being process based per connection instead of thread based makes it much more difficult to share plans between connections and it also has no plan serialization ability. Where MSSQL can save plans to xml and they can be loaded on other servers and "frozen" to use that plan if desired, they can also be loaded into plan inspection tools that way as well [3].

1. https://learn.microsoft.com/en-us/sql/relational-databases/n...

2. https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-tr...

3. https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-tr...

  • PostgreSQL shares other caches between processes so they probably could have a global plan cache if they wanted. I wonder why they don’t though.

    One possible reason is that the planner configuration can be different per connection, so the plans might not transfer

    • > PostgreSQL shares other caches between processes so they probably could have a global plan cache if they wanted. I wonder why they don’t though.

      > One possible reason is that the planner configuration can be different per connection, so the plans might not transfer

      That's part of it, another big part is that the transactional DDL makes it more complicated, as different sessions might require different plans.

    • In MSSQL Server part of the plan match is the various session/connection options, if they are different there are different plans cached.

      I believe the plan data structure PG is intimately tied to process space memory addresses since it was never thought to share between them and can even contain executable code that was generated.

      This makes it difficult to share between processes without a heavy redesign but would be a good change IMO.