How to enforce unique constraints on a hypertable

We’ve had a case where a customer encountered an issue where some of their dates have different millisecond precision, resulting in them inserting duplicate records into their database.Example

Example:

		id						event_date
12345678910abcdefghij		2022-04-12 00:36:03.96
12345678910abcdefghij		2022-04-12 00:36:03.959

In this, the duplicate should have been rejected based on the id, but because the timestamp had a rounding issue, a duplicate record was inserted. Given that on a hypertable, unique constraints must be compounded with the event_date, how can we get around an issue like this where event_date might be slightly different?

Note that UNIQUE and PRIMARY constraints must include the partitioning key.

For testing:

tsdb=> CREATE TABLE "public"."events" (
tsdb(>     "id" text NOT NULL,
tsdb(>     "timestamp" timestamp NOT NULL,
tsdb(>     "event_date" timestamp NOT NULL,
tsdb(>     "event_id" numeric,
tsdb(>     "created_at" timestamptz(3) DEFAULT CURRENT_TIMESTAMP(3)
tsdb(> );
CREATE TABLE
tsdb=> -- Turn into hypertable
tsdb=> SELECT create_hypertable('events','event_date');
  create_hypertable
----------------------
 (21,public,events,t)
(1 row)

If you try to insert duplicate data, you still can:

tsdb=> INSERT INTO events VALUES
tsdb-> ('12345678910abcdefghij', '2022-04-12 00:36:03.96', '2022-04-12 00:36:03.96', '95');
INSERT 0 1
tsdb=> INSERT INTO events VALUES
tsdb-> ('12345678910abcdefghij', '2022-04-12 00:36:03.959', '2022-04-12 00:36:03.959', '96');
INSERT 0 1
tsdb=>

A way to get around this is to create a composite unique index over (id, event_date). With regards to the milliseconds issue, we can change the data type of the event_date to store seconds only (for example : timestamp(0) ) or may be trim the milliseconds using a trigger.

tsdb=> CREATE TABLE "public"."events" (
tsdb(>     "id" text NOT NULL,
tsdb(>     "timestamp" timestamp NOT NULL,
tsdb(>     "event_date" timestamp(0) NOT NULL,
tsdb(>     "event_id" numeric,
tsdb(>     "created_at" timestamptz(3) DEFAULT CURRENT_TIMESTAMP(3)
tsdb(> );
CREATE TABLE
tsdb=> -- Turn into hypertable
tsdb=> SELECT create_hypertable('events','event_date');
  create_hypertable
----------------------
 (24,public,events,t)
(1 row)

Now if you try to insert duplicate records, an ERROR will be thrown:

tsdb=> CREATE UNIQUE INDEX idxtest ON events ("id", "event_date");
CREATE INDEX
tsdb=> INSERT INTO events VALUES
tsdb-> ('12345678910abcdefghij', '2022-04-12 00:36:03.96', '2022-04-12 00:36:03.96', '95');
INSERT 0 1
tsdb=> INSERT INTO events VALUES
tsdb-> ('12345678910abcdefghij', '2022-04-12 00:36:03.959', '2022-04-12 00:36:03.959', '96');
ERROR:  duplicate key value violates unique constraint "_hyper_24_13_chunk_idxtest"
DETAIL:  Key (id, event_date)=(12345678910abcdefghij, 2022-04-12 00:36:04) already exists.
tsdb=>
2 Likes