Comment by branko_d

1 month ago

Why use string as status, instead of a boolean? That just wastes space for no discernable benefit, especially since the status is indexed. Also, consider turning event_type into an integer if possible, for similar reasons.

Furthermore, why have two indexes with the same leading field (status)?

Boolean is rarely enough for real production workloads. You need a 'processing' state to handle visibility timeouts and prevent double-execution, especially if tasks take more than a few milliseconds. I also find it crucial to distinguish between 'retrying' for transient errors and 'failed' for dead letters. Saving a few bytes on the index isn't worth losing that observability.

  • > Boolean is rarely enough for real production workloads. You need a 'processing' ... 'retrying'... 'failed' ...

    If you have more than 2 states, then just use integer instead or boolean.

    > Saving a few bytes on the index isn't worth losing that observability.

    Not sure why having a few well-known string values is more "observable" than having a few well-known integer values.

    Also, it might be worth having better write performance. When PostgreSQL updates a row, it actually creates a new physical row version (for MVCC), so the less it has to copy the better.

    • Postgres supports enum that would fit this use case well. You get the readability of text and the storage efficiency of an integer. Adding new values used to require a bit of work, but version 9.1 introduced support for it.

Postgres does index de-duplication. So it's likely that even if you change the strings to enums, the index won't be that much smaller.

> Furthermore, why have two indexes with the same leading field (status)?

That indeed is a valid question.