A few weeks ago, an article on Cybertec was published that warned readers to be aware of the SELECT performance impact of using partitions in PostgreSQL.

Having done a bit of work with partitions to build capability for Bymetric, something about the article irked me. There’s no arguing that it highlights a performance degradation when querying for an indexed value in a single table vs. a partitioned table, but the benchmark is far removed from practical use cases.

I accept it’s an academic point made to illustrate performance, but it runs the risk of turning people away from partitioning.

In the article, the poorly performing partitioned table is set up such that:

  • The partitions are segmented by a modulus/remainder hash function on the serial ID column
  • The val column is a randomly generated integer that gets indexed (this is what gets queried)
  • The partitioning strategy is purely to split data volumes equally, not based on a practical use case
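For concreteness, a minimal sketch of that kind of setup might look like this. The table and column names here (part, id, val) are my own illustration, not lifted from the article, and the modulus of 4 is arbitrary:

```sql
-- Hash-partition on the serial id; the queried val column is only indexed.
CREATE TABLE part (
    id  serial,
    val integer
) PARTITION BY HASH (id);

CREATE TABLE part_0 PARTITION OF part FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE part_1 PARTITION OF part FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE part_2 PARTITION OF part FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE part_3 PARTITION OF part FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- Randomly generated integers, spread evenly across partitions by the hash.
INSERT INTO part (val)
    SELECT (random() * 100000)::integer
    FROM generate_series(1, 1000000);

-- Creating the index on the parent cascades to every partition.
CREATE INDEX ON part (val);
```

Note that the partition key (id) and the queried column (val) have nothing to do with each other, which is exactly the mismatch at the heart of the benchmark.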

There is no WHERE clause on the partition key to help the query planner prune partitions, and because the partitioning strategy is over the id column rather than the val column, the planner has no chance: it must probe the index in every partition.
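You can see the difference with EXPLAIN. Assuming a hash-partitioned table along the lines the article describes (here hypothetically named part, partitioned by id, with an index on val):

```sql
-- Filtering on val gives the planner nothing to prune on, so the plan
-- appends an index scan over every partition.
EXPLAIN SELECT * FROM part WHERE val = 42;

-- Filtering on the partition key lets the planner prune down to the
-- single partition whose hash bucket can contain id = 42.
EXPLAIN SELECT * FROM part WHERE id = 42;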

There is a common need to quickly look up a single row using an identifier.