Database Indexes in PostgreSQL and Timescale: Your Questions, Answered
Database indexes are a big topic in the PostgreSQL landscape. Timescale provides impressive time-series optimizations on top of PostgreSQL: you get performance at scale, cost-efficiency, and all the time-series features you need without toiling through yet another learning curve. Timescale is just PostgreSQL—with time-series superpowers.
Still, we often get questions from customers on how indexes work with Timescale. The short answer to those questions is: database indexes in Timescale work the same as in regular PostgreSQL. But perhaps, you want to dig a bit deeper.
In this blog post, we’ll walk you through the most common questions we get on this topic:
- Does a normal PostgreSQL
INDEXwork with Timescale?
- Will creating a Timescale hypertable use the existing PostgreSQL indexes?
- After creating the Timescale hypertable, do I use the standard
CREATE INDEXmethod for creating new indexes?
- Does the PostgreSQL Planner use the indexes with Timescale hypertables?
- Do Timescale hypertables use composite indexes?
You already know the answer to all of the above is “yes,” but let’s jump into more detailed answers.
What Happens to My Database Indexes When I Convert a PostgreSQL Table Into a Hypertable?
In Timescale, one must first create or use a regular PostgreSQL table to convert it into a hypertable.
What happens to the database indexes during that process?
The following is a straightforward PostgreSQL table definition we will use in our examples, including a definition of an index on
CREATE TABLE public.my_table ( "time" timestamp without time zone NOT NULL, id integer, value real ); CREATE INDEX my_table_id_index ON public.my_table USING btree (id);
At this point, we have a normal PostgreSQL table and
INDEX defined. It will look like this when viewed from psql:
index_example=> \d my_table Table "public.my_table" Column | Type | Collation | Nullable | Default --------+-----------------------------+-----------+----------+--------- time | timestamp without time zone | | not null | id | integer | | | value | real | | | Indexes: "my_table_id_index" btree (id)
The Timescale UI also shows us this information. If we navigate to the Explorer in Timescale, we’ll see this table showing as a normal PostgreSQL table (PostgreSQL tables are tagged with a blue indicator):
We can highlight the “Indexes” tab to see the PostgreSQL index:
Now, let’s convert the PostgreSQL table into a hypertable:
SELECT create_hypertable( 'my_table', 'time', chunk_time_interval => INTERVAL '1 hour', migrate_data => TRUE );
In psql, this looks a bit different now:
index_example=> \d my_table Table "public.my_table" Column | Type | Collation | Nullable | Default --------+-----------------------------+-----------+----------+--------- time | timestamp without time zone | | not null | id | integer | | | value | real | | | Indexes: "my_table_id_index" btree (id) "my_table_time_idx" btree ("time" DESC) Triggers: ts_insert_blocker BEFORE INSERT ON my_table FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
And in the Timescale UI:
The table is now converted to a hypertable, and we can also see the number of chunks (the internal units of partitioning in Timescale’s hypertables).
How does our original index look now?
The index on
id is the same. This is because Timescale is built on Postgres, not beside, instead of, parallel to, etc. Timescale fully supports generic PostgreSQL tables and
INDEX definitions as they are (see Editor's Note below). You may have also noticed that there is an additional
INDEX: the creation of the hypertable will automatically create an
INDEX for the timestamp column used in the time-series data.
So long story short, the direct answer to our original question, “What happens to a PostgreSQL
INDEX when a table is converted to a hypertable?” is… Not much! You’ll be able to access your hypertable the same way you were accessing your original PostgreSQL table, plus you will also have an additional index for the timestamp column.
What Is Going On Under the Hood?
A hypertable is an abstraction layer—to the end user, the hypertable looks like a normal PostgreSQL table. In fact, at Timescale, we want end users to view them as such so they do not need to learn a new data structure or language.
While the end user interacts with the hypertable as a normal PostgreSQL table, the Timescale software is an abstraction layer that buffers the end user for the details. Essentially, a hypertable consists of smaller tables called chunks. These chunks are actual PostgreSQL tables themselves. However, Timescale manages all chunks as one logical table.
Timescale chunks are separated by a time
INTERVAL. Each chunk contains a subset of the total data—again, based on a time
INTERVAL. Since each chunk is a separate PostgreSQL table, each chunk will have its own set of indexes.
To view individual chunk table definitions, use the
index_example=> SELECT show_chunks( 'my_table' ); show_chunks ----------------------------------------- _timescaledb_internal._hyper_1_1_chunk _timescaledb_internal._hyper_1_2_chunk _timescaledb_internal._hyper_1_3_chunk _timescaledb_internal._hyper_1_4_chunk _timescaledb_internal._hyper_1_5_chunk _timescaledb_internal._hyper_1_6_chunk _timescaledb_internal._hyper_1_7_chunk _timescaledb_internal._hyper_1_8_chunk _timescaledb_internal._hyper_1_9_chunk _timescaledb_internal._hyper_1_10_chunk
Since each chunk is a PostgreSQL table, we can view a chunk and see its indexes:
index_example=> \d _timescaledb_internal._hyper_1_1_chunk Table "_timescaledb_internal._hyper_1_1_chunk" Column | Type | Collation | Nullable | Default --------+-----------------------------+-----------+----------+--------- time | timestamp without time zone | | not null | id | integer | | | value | real | | | Indexes: "_hyper_1_1_chunk_my_table_id_index" btree (id) "_hyper_1_1_chunk_my_table_time_idx" btree ("time" DESC) Check constraints: "constraint_1" CHECK ("time" >= '2022-04-01 00:00:00'::timestamp without time zone AND "time" < '2022-04-02 00:00:00'::timestamp without time zone) Inherits: my_table
Notice that the two indexes were defined for the chunk:
index_example=> SELECT indexdef FROM pg_indexes WHERE indexname LIKE '_hyper_1_1_chunk%'; indexdef --------------------------------------------------------------------------- CREATE INDEX _hyper_1_1_chunk_my_table_id_index ON _timescaledb_internal._hyper_1_1_chunk USING btree (id) CREATE INDEX _hyper_1_1_chunk_my_table_time_idx ON _timescaledb_internal._hyper_1_1_chunk USING btree ("time" DESC)
How Do Indexes and Hypertables Work?
As you may now guess from the previous section, any
INDEX will be used with a hypertable, just as with any normal PostgreSQL table. All the query planning regarding the internal partitioning into chunks happens automatically.
If we perform a simple query, we will see the
INDEX being used in the query planner just like it would be with the normal PostgreSQL table:
index_example=> EXPLAIN SELECT AVG(value) FROM my_table WHERE id = 5 AND time BETWEEN '2022-04-01 00:00:00' and '2022-04-01 23:59:59'; QUERY PLAN ------------------------------------------------------------------------------- Aggregate (cost=28.35..28.36 rows=1 width=8) -> Index Scan using _hyper_1_1_chunk_my_table_id_index on _hyper_1_1_chunk (cost=0.29..28.29 rows=24 width=4) Index Cond: (id = 5) Filter: (("time" >= '2022-04-01 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-01 23:59:59'::timestamp without time zone))
We need to note a couple of things about the query plan:
- Timescale inherently knows which chunks to query based on the time
INTERVALwhen the hypertable was created.
- The user-added
idcolumn) is consulted to find the appropriate rows.
Timescale hides the complexity of indexing while at the same time providing the same functionality as normal PostgreSQL indexes.
How Do I Define an
INDEX Directly in a Hypertable?
If you want to define indexes directly in your hypertable (after having transformed your original PostgreSQL table into a hypertable), the process is exactly the same as normal PostgreSQL. Use
If, for some reason, the
INDEX was not created with the original PostgreSQL table, one just needs to use the same syntax with the hypertable:
CREATE INDEX my_table_id_index ON my_table(id);
Do I Need to Do Anything Differently With Indexes and Hypertables vs. Regular PostgreSQL?
No! And that is the point: use Timescale and hypertables just as one would with PostgreSQL tables!
What About Creating Indexes on Existing Large Databases?
Actually, Timescale provides a “lighter” load than normal PostgreSQL when creating a new
INDEX on an existing large table. One can use an option when creating a new
INDEX that will create indexes on a per-chunk basis, which means only locking that specific chunk! This means the entire table is not locked during a
More information on
CREATE INDEX on a hypertable can be found in our docs.
We hope this article gave you the answers you were looking for on how database indexes in PostgreSQL and Timescale work together.
Timescale expands PostgreSQL for time-series data, adding the superpowers you need for your time-series applications: high performance at scale, cost-efficiency, and time-saving features specifically designed for time series. All without ever leaving behind the PostgreSQL you already know and love. So if you’re handling time-series data, give it a try!
Sign up for a free 30-day trial (no credit card required), and start indexing!