Database Indexes in PostgreSQL and Timescale: Your Questions, Answered

Database Indexes in PostgreSQL and Timescale: Your Questions, Answered

Database indexes are a big topic in the PostgreSQL landscape. Timescale provides impressive time-series optimizations on top of PostgreSQL: you get performance at scale, cost-efficiency, and all the time-series features you need without toiling through yet another learning curve. Timescale is just PostgreSQL—with time-series superpowers.


Still, we often get questions from customers on how indexes work with Timescale. The short answer to those questions is: database indexes in Timescale work the same as in regular PostgreSQL. But perhaps, you want to dig a bit deeper.

In this blog post, we’ll walk you through the most common questions we get on this topic:

  • Does a normal PostgreSQL INDEX work with Timescale?
  • Will creating a Timescale hypertable use the existing PostgreSQL indexes?
  • After creating the Timescale hypertable, do I use the standard CREATE INDEX method for creating new indexes?
  • Does the PostgreSQL Planner use the indexes with Timescale hypertables?
  • Do Timescale hypertables use composite indexes?

You already know the answer to all of the above is “yes,” but let’s jump into more detailed answers.

What Happens to My Database Indexes When I Convert a PostgreSQL Table Into a Hypertable?

In Timescale, one must first create or use a regular PostgreSQL table to convert it into a hypertable.

What happens to the database indexes during that process?

The following is a straightforward PostgreSQL table definition we will use in our examples, including a definition of an index on id:

CREATE TABLE public.my_table (
	"time" timestamp without time zone NOT NULL,
	id integer,
	value real
);
CREATE INDEX my_table_id_index ON public.my_table USING btree (id); 

At this point, we have a normal PostgreSQL table and INDEX defined. It will look like this when viewed from psql:

index_example=> \d my_table
                    	Table "public.my_table"
 Column |        	Type          | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
 time   | timestamp without time zone |       	| not null |
 id     | integer                  	|       	|      	|
 value  | real                    	|       	|      	|
Indexes:
	"my_table_id_index" btree (id)

The Timescale UI also shows us this information. If we navigate to the Explorer in Timescale, we’ll see this table showing as a normal PostgreSQL table (PostgreSQL tables are tagged with a blue indicator):

The Explorer in the Timescale Cloud UI showing a normal PostgreSQL table


We can highlight the “Indexes” tab to see the PostgreSQL index:

The Explorer in the Timescale Cloud UI showing the table in the Indexes tab


Now, let’s convert the PostgreSQL table into a hypertable:


SELECT create_hypertable( 'my_table', 'time', chunk_time_interval => INTERVAL '1 hour', migrate_data => TRUE );

In psql, this looks a bit different now:

index_example=> \d my_table
                    	Table "public.my_table"
 Column |             Type            | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
 time   | timestamp without time zone |           | not null |
 id     | integer                     |           |          |
 value  | real                        |           |          |
Indexes:
	"my_table_id_index" btree (id)
	"my_table_time_idx" btree ("time" DESC)
Triggers:
	ts_insert_blocker BEFORE INSERT ON my_table FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()

And in the Timescale UI:

The Timescale UI showing the hypertable

The table is now converted to a hypertable, and we can also see the number of chunks (the internal units of partitioning in Timescale’s hypertables).

How does our original index look now?

The Timescale UI showing the index for the hypertable

The index on id is the same. This is because Timescale is built on Postgres, not beside, instead of, parallel to, etc. Timescale fully supports generic PostgreSQL tables and INDEX definitions as they are (see Editor's Note below). You may have also noticed that there is an additional INDEX: the creation of the hypertable will automatically create an INDEX for the timestamp column used in the time-series data.

So long story short, the direct answer to our original question, “What happens to a PostgreSQL INDEX when a table is converted to a hypertable?” is… Not much! You’ll be able to access your hypertable the same way you were accessing your original PostgreSQL table, plus you will also have an additional index for the timestamp column.

Editor’s Note: While indexes are fully supported, there is a limitation on how they are implemented with Timescale hypertables. They do not support CREATE INDEX CONCURRENTLY. This, however, should not affect how you work with indexes in Timescale. As we said, it’s business as usual!

What Is Going on Under the Hood?

A hypertable is an abstraction layer—to the end user, the hypertable looks like a normal PostgreSQL table. In fact, at Timescale, we want end users to view them as such so they do not need to learn a new data structure or language.

While the end user interacts with the hypertable as a normal PostgreSQL table, the Timescale software is an abstraction layer that buffers the end user for the details. Essentially, a hypertable consists of smaller tables called chunks. These chunks are actual PostgreSQL tables themselves. However, Timescale manages all chunks as one logical table.

Timescale chunks are separated by a time INTERVAL. Each chunk contains a subset of the total data—again, based on a time INTERVAL. Since each chunk is a separate PostgreSQL table, each chunk will have its own set of indexes.

To view individual chunk table definitions, use the show_chunk() function:

index_example=> SELECT show_chunks( 'my_table' );
               show_chunks
-----------------------------------------
 _timescaledb_internal._hyper_1_1_chunk
 _timescaledb_internal._hyper_1_2_chunk
 _timescaledb_internal._hyper_1_3_chunk
 _timescaledb_internal._hyper_1_4_chunk
 _timescaledb_internal._hyper_1_5_chunk
 _timescaledb_internal._hyper_1_6_chunk
 _timescaledb_internal._hyper_1_7_chunk
 _timescaledb_internal._hyper_1_8_chunk
 _timescaledb_internal._hyper_1_9_chunk
 _timescaledb_internal._hyper_1_10_chunk

Since each chunk is a PostgreSQL table, we can view a chunk and see its indexes:

INDEX

index_example=> \d _timescaledb_internal._hyper_1_1_chunk
            Table "_timescaledb_internal._hyper_1_1_chunk"
 Column |            Type             | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
 time   | timestamp without time zone |           | not null |
 id     | integer                     |           |          |
 value  | real                        |           |          |
Indexes:
    "_hyper_1_1_chunk_my_table_id_index" btree (id)
    "_hyper_1_1_chunk_my_table_time_idx" btree ("time" DESC)
Check constraints:
    "constraint_1" CHECK ("time" >= '2022-04-01 00:00:00'::timestamp without time zone AND "time" < '2022-04-02 00:00:00'::timestamp without time zone)
Inherits: my_table

Notice that the two indexes were defined for the chunk:

index_example=> SELECT indexdef
                FROM pg_indexes
                WHERE indexname LIKE '_hyper_1_1_chunk%';
                               indexdef
---------------------------------------------------------------------------
 CREATE INDEX _hyper_1_1_chunk_my_table_id_index ON _timescaledb_internal._hyper_1_1_chunk USING btree (id)
 CREATE INDEX _hyper_1_1_chunk_my_table_time_idx ON _timescaledb_internal._hyper_1_1_chunk USING btree ("time" DESC)

How Do Indexes and Hypertables Work?

As you may now guess from the previous section, any INDEX will be used with a hypertable, just as with any normal PostgreSQL table. All the query planning regarding the internal partitioning into chunks happens automatically.

If we perform a simple query, we will see the INDEX being used in the query planner just like it would be with the normal PostgreSQL table:

index_example=> EXPLAIN SELECT AVG(value)
                FROM my_table
                WHERE id = 5 AND
                      time BETWEEN '2022-04-01 00:00:00' and '2022-04-01 23:59:59';
                               QUERY PLAN
-------------------------------------------------------------------------------
 Aggregate  (cost=28.35..28.36 rows=1 width=8)
   ->  Index Scan using _hyper_1_1_chunk_my_table_id_index on _hyper_1_1_chunk  (cost=0.29..28.29 rows=24 width=4)
         Index Cond: (id = 5)
         Filter: (("time" >= '2022-04-01 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-01 23:59:59'::timestamp without time zone))

We need to note a couple of things about the query plan:

  • Timescale inherently knows which chunks to query based on the time INTERVAL when the hypertable was created.
  • The user-added INDEX (on the id column) is consulted to find the appropriate rows.

Timescale hides the complexity of indexing while at the same time providing the same functionality as normal PostgreSQL indexes.

How Do I Define an INDEX Directly in a Hypertable?

If you want to define indexes directly in your hypertable (after having transformed your original PostgreSQL table into a hypertable), the process is exactly the same as normal PostgreSQL. Use CREATE INDEX!

If, for some reason, the id INDEX was not created with the original PostgreSQL table, one just needs to use the same syntax with the hypertable:

CREATE INDEX my_table_id_index ON my_table(id);

Do I Need to Do Anything Differently With Indexes and Hypertables vs. Regular PostgreSQL?

No! And that is the point: use Timescale and hypertables just as one would with PostgreSQL tables!

What About Creating Indexes on Existing Large Databases?

Actually, Timescale provides a “lighter” load than normal PostgreSQL when creating a new INDEX on an existing large table. One can use an option when creating a new INDEX that will create indexes on a per-chunk basis, which means only locking that specific chunk! This way, the entire table is not locked during a CREATE INDEX.

More information on CREATE INDEX on a hypertable can be found in our docs.

Editor’s Note: If you want to learn more about indexes in Timescale, check out our documentation and learn how to fine-tune your PostgreSQL database performance by optimizing your database indexes.

Next Steps

We hope this article gave you the answers you were looking for on how database indexes in PostgreSQL and Timescale work together.

Timescale expands PostgreSQL for time-series data, adding the superpowers you need for your time-series applications: high performance at scale, cost-efficiency, and time-saving features specifically designed for time series. All without ever leaving behind the PostgreSQL you already know and love. So if you’re handling time-series data, give it a try!


Sign up for a free 30-day trial (no credit card required), and start indexing!

Ingest and query in milliseconds, even at terabyte scale.
This post was written by
6 min read
Cloud
Contributors

Related posts