Dealing with partitions in Postgres 11+

Photo by Tobias Adam on Unsplash

One of the main features of PostgreSQL 10 was support for declarative partitioning. And PostgreSQL 11 brought many improvements to it.

But does that solve all the issues, and what other issues you may get from using partitions in your project?

Let’s start with the table we’d like to partition:

CREATE TABLE my_table (
id bigserial not null primary key,
created_at timestamp not null,
updated_at timestamp not null,
external_id uuid not null,
status integer not null,
unique (external_id)
);

We assume that we get tens of millions of objects per day, which are uniquely identified by external_id

Due to the fact that we mostly care about today’s data and we also like to scrap object that are older than a week, we decided to partition by created_at

CREATE TABLE my_table (
id bigserial not null primary key,
created_at timestamp not null,
updated_at timestamp not null,
external_id uuid not null,
status integer not null,
unique (external_id)
) PARTITION BY RANGE (created_at);

Makes sense, right?

Well, not really:

ERROR: insufficient columns in PRIMARY KEY constraint definition
PRIMARY KEY constraint on table "my_table" lacks column "created_at" which is part of the partition key.

Ok, let’s move primary key definition:

CREATE TABLE my_table (
id bigserial not null,
created_at timestamp not null,
updated_at timestamp not null,
external_id uuid not null,
status integer not null,
primary key (id, created_at),
unique (external_id)
) PARTITION BY RANGE (created_at);

Well?

ERROR: insufficient columns in UNIQUE constraint definition
Detail: UNIQUE constraint on table "my_table" lacks column "created_at" which is part of the partition key.

So, both the primary key and unique keys need to include the partition key.

CREATE TABLE my_table (
id bigserial not null,
created_at timestamp not null,
updated_at timestamp not null,
external_id uuid not null,
status integer not null,
primary key (id, created_at),
unique (external_id, created_at)
) PARTITION BY RANGE (created_at);

Let’s try upsert query on that table:

insert into my_table 
(created_at, updated_at, external_id, status)
values
(now(), now(), '03e5e53d-9a5e-4fb3-a62d-c687f17dae74', 1)
on conflict (external_id) do
update set status = 1
returning id;

Not good:

ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification

So, we need to include the partition key in our upserts too:

insert into my_table 
(created_at, updated_at, external_id, status)
values
(now(), now(), '03e5e53d-9a5e-4fb3-a62d-c687f17dae74', 1)
on conflict (external_id, created_at) do
update set status = 1
returning id;

Another error:

ERROR: no partition of relation "my_table" found for row
Detail: Partition key of the failing row contains (created_at) = (some timestamp).

Right, we didn’t create any partition yet. But there’s a quick fix for that since Postgres 11 — defult partitions.

create table my_table_default PARTITION OF my_table DEFAULT;

Now, let’s try inserting object with same external_id twice. We expect to have only one row, with status = 2

insert into my_table 
(created_at, updated_at, external_id, status)
values
(now(), now(), '03e5e53d-9a5e-4fb3-a62d-c687f17dae74', 2)
on conflict (external_id, created_at) do
update set status = 2
returning id;

But we have two:

external_id                          status
-------------------------------------------
03e5e53d-9a5e-4fb3-a62d-c687f17dae74 1
03e5e53d-9a5e-4fb3-a62d-c687f17dae74 2

Reason is quite obvious when we look back at our unique key. It includes created_at , since that’s the partition key. But created_at is usually set automatically, rendering our unique key useless for upserts.

Are we stuck? Not yet.

Let’s go back to our table definition and add a new column, called day:

CREATE TABLE my_table (
id bigserial not null,
day date not null default now(),
created_at timestamp not null,
updated_at timestamp not null,
external_id uuid not null,
status integer not null,
primary key (id, day),
unique (external_id, day)
) PARTITION BY RANGE (day);

Now, this column, and not created_at, becomes our partition key.

And now we get the expected behaviour from our upserts:

external_id                          status
-------------------------------------------
03e5e53d-9a5e-4fb3-a62d-c687f17dae74 2

I know what you’re thinking. Having this additional column may seem ugly.

So, maybe using expressions as our partition key would help us?

Let’s try it:

CREATE TABLE my_table (
id bigserial not null,
created_at timestamp not null,
updated_at timestamp not null,
external_id uuid not null,
status integer not null,
primary key (id),
unique (external_id)
) PARTITION BY RANGE (date_trunc('day', created_at));

Doesn’t seem so:

ERROR: unsupported PRIMARY KEY constraint with partition key definition
Detail: PRIMARY KEY constraints cannot be used when partition keys include expressions.

Ok, maybe if we just get rid of the primary key it will work fine?

ERROR: unsupported UNIQUE constraint with partition key definition
Detail: UNIQUE constraints cannot be used when partition keys include expressions.

Maybe if we can get rid of the primary key it would work?

CREATE TABLE my_table (
id bigserial not null,
created_at timestamp not null,
updated_at timestamp not null,
external_id uuid not null,
status integer not null,
unique (external_id)
) PARTITION BY RANGE (date_trunc('day', created_at));

Nope:

ERROR: unsupported UNIQUE constraint with partition key definition
Detail: UNIQUE constraints cannot be used when partition keys include expressions.

So, you can use expression as your partition key, but then you must forfeit unique keys.

And in some cases, it may be a valid choice.

Consider, for example, a situation where we never mutate our data, and use only inserts. It becomes then the application decision which row is the correct one. You may decide it’s the row with the greatest created_at , or with greatest status , or probably you’ll have some even more complicated business logic to resolve that.

The last, rather trivial part that’s left for us to do is to create the partitions themselves.

We can create hundred of them ahead of time:

Let’s take a look at the execution plan now:

explain analyse select *
from my_table
where external_id = '03e5e53d-9a5e-4fb3-a62d-c687f17dae74'
and day between now() - interval '1 day' and now() + '1 day';

Note that I decided to select ±1 days

Append  (cost=0.16..4922.19 rows=601 width=48) (actual time=0.033..0.037 rows=1 loops=1)
Subplans Removed: 99
-> Index Scan using my_table_2019_12_20_external_id_day_key on my_table_2019_12_20 (cost=0.16..8.18 rows=1 width=48) (actual time=0.033..0.033 rows=1 loops=1)
Index Cond: ((external_id = '03e5e53d-9a5e-4fb3-a62d-c687f17dae74'::uuid) AND (day >= (now() - '1 day'::interval)) AND (day <= (now() + '1 day'::interval)))
-> Index Scan using my_table_2019_12_21_external_id_day_key on my_table_2019_12_21 (cost=0.16..8.18 rows=1 width=48) (actual time=0.003..0.003 rows=0 loops=1)
Index Cond: ((external_id = '03e5e53d-9a5e-4fb3-a62d-c687f17dae74'::uuid) AND (day >= (now() - '1 day'::interval)) AND (day <= (now() + '1 day'::interval)))

As you can see, PostgreSQL successfully prunes the partitions, scanning only the relevant ones, then uses the unique index we created on the parent table to fetch the particular row.

Conclusions

You should make good use of partitions in PostgreSQL.

But consider carefully the requirements:

  • Both the primary key and all unique keys must include partition key in their definition. Make sure that the unique key still makes sense
  • Partitions are not created automatically, so make sure you have a default partition
  • Expressions can be used as partition key, but then you cannot use neither primary key nor unique keys

Solutions Architect @Depop, author of “Hands-on Design Patterns with Kotlin” book and “Web Development with Kotlin” course

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store