Introduction to PostgreSQL Indexes

15 days ago (dlt.github.io)

18 comments

dlt

cdiamand 15 days ago

Linking to the postgresql docs since they are very well written and surprisingly enjoyable to read.

https://www.postgresql.org/docs/current/indexes-intro.html

brudgers 15 days ago

Related: Use the Index, Luke

https://use-the-index-luke.com/

jihadjihad 15 days ago

The section on multi-column indexes mirrors how I was taught and how I’ve generally handled such indexes in the past. But is it still true for more recent PG versions? I had an index and query similar to the third example, and IIRC PG was able to use an index, though I believe it was a bitmap index scan.

I am also unsure of the specific perf tradeoffs between index scan types in that case, but when I saw that happen in the EXPLAIN plan it was enough for me to call into question what had been hardcoded wisdom in my mind for quite some time.

Further essential reading is the classic Use The Index, Luke [0] site, and the book is a great buy for the whole team.

0: https://use-the-index-luke.com/

petergeoghegan 15 days ago
> The section on multi-column indexes mirrors how I was taught and how I’ve generally handled such indexes in the past. But is it still true for more recent PG versions?
No, it isn't. PostgreSQL 18 added support for index skip scan:
https://youtu.be/RTXeA5svapg?si=_6q3mj1sJL8oLEWC&t=1366
It's actually possible to use a multicolumn index with a query that only has operators on its lower-order columns in earlier versions. But that requires a full index scan, which is usually very inefficient.
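To make the difference concrete, here is a toy Python model of the two strategies described above. It is only a sketch of the idea, not how PostgreSQL implements skip scan: the "index" is a sorted list of tuples, and the column names `region` and `user_id` are made up for illustration. The query filters only on the second (lower-order) column.

```python
from bisect import bisect_left

# Hypothetical two-column index on (region, user_id), kept in sorted order.
index = sorted([
    ("eu", 3), ("eu", 7), ("eu", 9),
    ("us", 1), ("us", 7),
    ("ap", 7), ("ap", 8),
])


def full_index_scan(index, user_id):
    """Read every entry and filter on the second column."""
    return [entry for entry in index if entry[1] == user_id]


def skip_scan(index, user_id):
    """Probe once per distinct leading value instead of reading every entry."""
    matches = []
    pos = 0
    while pos < len(index):
        region = index[pos][0]
        # Jump straight to (region, user_id) within this region's entries.
        hit = bisect_left(index, (region, user_id), pos)
        while hit < len(index) and index[hit] == (region, user_id):
            matches.append(index[hit])
            hit += 1
        # Skip past the rest of this region to the next distinct leading value.
        pos = bisect_left(index, (region, float("inf")), hit)
    return matches
```

Both functions return the same rows; the skip scan does one binary-search probe per distinct `region` rather than touching every entry, which is why it pays off mainly when the leading column has few distinct values.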
- dlt 15 days ago
  
  Hi Peter, author here. Thanks for weighing in with the extra context on index skip scan, and huge thanks for adding this to Postgres.
  I’m going to revise the multi-column index section to be more precise about when leftmost-prefix rules apply, and I’ll include a note on how skip scan changes the picture.
glenjamin 15 days ago
A bitmap index scan allows the database to narrow down which pages could include the data, but it then still has to recheck the condition against the contents of those pages, so it will still not be as performant as a proper index scan.
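A rough Python sketch of that recheck cost, assuming a page-level bitmap as described in the comment above. The `heap` and `status_index` structures are hypothetical stand-ins, not PostgreSQL's actual data layout.

```python
# Hypothetical heap: page number -> rows stored on that page.
heap = {
    0: [{"id": 1, "status": "new"}, {"id": 2, "status": "done"}],
    1: [{"id": 3, "status": "done"}, {"id": 4, "status": "done"}],
    2: [{"id": 5, "status": "new"}, {"id": 6, "status": "new"}],
}

# Hypothetical index on status: value -> list of (page, slot) pointers.
status_index = {
    "new": [(0, 0), (2, 0), (2, 1)],
    "done": [(0, 1), (1, 0), (1, 1)],
}


def bitmap_scan(heap, index, status):
    """Collect candidate pages, then recheck every row on those pages."""
    candidate_pages = sorted({page for page, _slot in index.get(status, [])})
    matches = []
    rows_examined = 0
    for page in candidate_pages:
        for row in heap[page]:
            rows_examined += 1
            if row["status"] == status:  # the recheck step
                matches.append(row["id"])
    return matches, rows_examined


def index_scan(heap, index, status):
    """Follow each pointer directly, without looking at neighbouring rows."""
    matches = [heap[page][slot]["id"] for page, slot in index.get(status, [])]
    return matches, len(matches)
```

For `status == "new"`, both scans find the same three rows, but the bitmap version examines four rows because page 0 also holds a non-matching row that has to be rechecked.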
- isbvhodnvemrwvn 14 days ago
  
  With postgres indexes not containing liveness data for tuples you'll have to hit quite a lot of those pages anyway, unless they are frozen.

zozbot234 15 days ago

It would be nice to see out-of-the-box support in PostgreSQL for what's known as incremental view maintenance. It's very much an index in that it gets updated automatically when the underlying data changes, but it supports that for arbitrary views - not just special-cased like ordinary database indexes.
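As a minimal illustration of the idea (not of any particular implementation), here is a Python sketch of a view that is updated from each change rather than recomputed from scratch. The class `OrderTotalsView` and the `(customer, amount)` row shape are invented for this example, and it ignores transactions entirely.

```python
from collections import defaultdict


class OrderTotalsView:
    """Toy materialized view: SUM(amount) grouped by customer."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)

    def on_insert(self, customer, amount):
        # Apply the delta of one inserted base row.
        self.totals[customer] += amount
        self.counts[customer] += 1

    def on_delete(self, customer, amount):
        # Apply the inverse delta; drop the group when its last row goes away.
        self.totals[customer] -= amount
        self.counts[customer] -= 1
        if self.counts[customer] == 0:
            del self.totals[customer]
            del self.counts[customer]

    def rows(self):
        return dict(self.totals)


def recompute(orders):
    """Full refresh: rebuild the view by rescanning every base row."""
    totals = defaultdict(int)
    for customer, amount in orders:
        totals[customer] += amount
    return dict(totals)
```

The incremental view does constant work per change, while `recompute` rescans the whole base table. Both should always agree; keeping them in agreement for arbitrary views is the hard part.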

BenoitP 15 days ago

A hard problem, especially wrt transactions on a moving target.
From memory, a handful of projects are dedicated just to this dimension of databases: Noria, Materialize, Apache Flink, GCP's Continuous Queries, Apache Spark Streaming Tables, Delta Tables, ClickHouse streaming tables, TimescaleDB, ksqlDB, StreamSQL; and probably dozens more. IIRC, since this is about postgres, there is a recently created extension that tries to deal with this: pg_ivm
lispisok 14 days ago

If you have time-series data, TimescaleDB has this with continuous aggregates.

turbocon 15 days ago

This looks really awesome for Postgres

For general B-tree index resources, this has been my go-to site for years: https://use-the-index-luke.com/

augusteo 14 days ago

Good timing for this article. The multi-column index advice was always confusing because the "leading column" rules had real performance implications, but bitmap index scans made it less catastrophic than the textbooks suggested.

Skip scan in PG 18 changes a lot of that conventional wisdom. Worth updating the mental model for anyone who learned indexing on older versions.

morshu9001 14 days ago

The whole btree vs hash discussion is interesting. Many people assume "ID" columns should be hash, but iirc the default btree works best for those. Also treelike structures are fundamentally better for nearly-sequential value insertion.

The blog post that this links to comes to the opposite conclusion though, showing hash winning the benchmarks.
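One difference between the two that does not depend on benchmarks: an ordered structure can answer range queries, while a hash structure only answers equality. A small Python sketch, using a sorted list as a stand-in for a B-tree's key order and a dict as a stand-in for a hash index (the IDs are made up):

```python
from bisect import bisect_left, bisect_right

ids = [101, 102, 103, 107, 110]  # kept sorted, like keys in a B-tree
hash_index = {key: pos for pos, key in enumerate(ids)}


def hash_lookup(key):
    """Equality lookup via hashing; no notion of key order."""
    return hash_index.get(key)


def ordered_lookup(key):
    """Equality lookup via binary search over the sorted keys."""
    pos = bisect_left(ids, key)
    return pos if pos < len(ids) and ids[pos] == key else None


def ordered_range(low, high):
    """Range lookup; only possible because the keys are stored in order."""
    return ids[bisect_left(ids, low):bisect_right(ids, high)]
```

Both structures agree on equality lookups, but only the ordered one can serve something like `ordered_range(102, 107)` without scanning every key.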

joaomsa 15 days ago

Essential reading. More in-depth than an introduction, but without being so dense that only those dealing with the internals can follow it.

zmmmmm 14 days ago

I love this style of writing. Simple, humble and direct transfer of knowledge.

Anonyneko 14 days ago

Is there a use-the-index-luke for MongoDB...?