Database Indexes in PostgreSQL and Timescale Cloud: Your Questions, Answered

Database Indexes in PostgreSQL and Timescale Cloud: Your Questions, Answered

Database indexes are a big topic in the PostgreSQL landscape. Timescale Cloud provides impressive time-series optimizations on top of PostgreSQL: you get performance at scale, cost-efficiency, and all the time-series features you need without toiling through yet another learning curve. Timescale Cloud is just PostgreSQL—with time-series superpowers.

Still, we often get questions from customers on how indexes work with Timescale Cloud. The short answer to those questions is: database indexes in Timescale Cloud work the same as in regular PostgreSQL. But perhaps, you want to dig a bit deeper.

In this blog post, we’ll walk you through the most common questions we get on this topic:

  • Does a normal PostgreSQL INDEX work with Timescale Cloud?
  • Will creating a Timescale Cloud hypertable use the existing PostgreSQL indexes?
  • After creating the Timescale Cloud hypertable, do I use the standard CREATE INDEX method for creating new indexes?
  • Does the PostgreSQL Planner use the indexes with Timescale Cloud hypertables?
  • Do Timescale Cloud hypertables use composite indexes?

You already know the answer to all of the above is “yes,” but let’s jump into more detailed answers.

What Happens to My Database Indexes When I Convert a PostgreSQL Table Into a Hypertable?

In Timescale Cloud, one must first create or use a regular PostgreSQL table to convert it into a hypertable.

What happens to the database indexes during that process?

The following is a straightforward PostgreSQL table definition we will use in our examples, including a definition of an index on id:

CREATE TABLE public.my_table (
	"time" timestamp without time zone NOT NULL,
	id integer,
	value real
);
CREATE INDEX my_table_id_index ON public.my_table USING btree (id); 

At this point, we have a normal PostgreSQL table and INDEX defined. It will look like this when viewed from psql:

index_example=> \d my_table
                    	Table "public.my_table"
 Column |        	Type          | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
 time   | timestamp without time zone |       	| not null |
 id     | integer                  	|       	|      	|
 value  | real                    	|       	|      	|
Indexes:
	"my_table_id_index" btree (id)

The Timescale Cloud UI also shows us this information. If we navigate to the Explorer in Timescale Cloud, we’ll see this table showing as a normal PostgreSQL table (PostgreSQL tables are tagged with a blue indicator):

The Explorer in the Timescale Cloud UI showing a normal PostgreSQL table


We can highlight the “Indexes” tab to see the PostgreSQL index:

The Explorer in the Timescale Cloud UI showing the table in the Indexes tab


Now, let’s convert the PostgreSQL table into a hypertable:


SELECT create_hypertable( 'my_table', 'time', chunk_time_interval => INTERVAL '1 hour', migrate_data => TRUE );

In psql, this looks a bit different now:

index_example=> \d my_table
                    	Table "public.my_table"
 Column |             Type            | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
 time   | timestamp without time zone |           | not null |
 id     | integer                     |           |          |
 value  | real                        |           |          |
Indexes:
	"my_table_id_index" btree (id)
	"my_table_time_idx" btree ("time" DESC)
Triggers:
	ts_insert_blocker BEFORE INSERT ON my_table FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()

And in the Timescale Cloud UI:

The Timescale UI showing the hypertable

The table is now converted to a hypertable, and we can also see the number of chunks (the internal units of partitioning in Timescale Cloud’s hypertables).

How does our original index look now?

The Timescale UI showing the index for the hypertable

The index on id is the same. This is because Timescale Cloud is built on Postgres, not beside, instead of, parallel to, etc. Timescale Cloud fully supports generic PostgreSQL tables and INDEX definitions as they are (see Editor's Note below).  You may have also noticed that there is an additional INDEX: the creation of the hypertable will automatically create an INDEX for the timestamp column used in the time-series data.

So long story short, the direct answer to our original question, “What happens to a PostgreSQL INDEX when a table is converted to a hypertable?” is… Not much! You’ll be able to access your hypertable the same way you were accessing your original PostgreSQL table, plus you will also have an additional index for the timestamp column.

Editor’s Note: While indexes are fully supported, there is a limitation on how they are implemented with Timescale Cloud hypertables. They do not support CREATE INDEX CONCURRENTLY. This, however, should not affect how you work with indexes in Timescale Cloud. As we said, it’s business as usual!

What Is Going On Under the Hood?

A hypertable is an abstraction layer—to the end user, the hypertable looks like a normal PostgreSQL table. In fact, at Timescale, we want end users to view them as such so they do not need to learn a new data structure or language.

While the end user interacts with the hypertable as a normal PostgreSQL table, the Timescale Cloud software is an abstraction layer that buffers the end user for the details. Essentially, a hypertable consists of smaller tables called chunks. These chunks are actual PostgreSQL tables themselves. However, Timescale Cloud manages all chunks as one logical table.

Timescale Cloud chunks are separated by a time INTERVAL. Each chunk contains a subset of the total data—again, based on a time INTERVAL. Since each chunk is a separate PostgreSQL table, each chunk will have its own set of indexes.

To view individual chunk table definitions, use the show_chunk() function:

index_example=> SELECT show_chunks( 'my_table' );
               show_chunks
-----------------------------------------
 _timescaledb_internal._hyper_1_1_chunk
 _timescaledb_internal._hyper_1_2_chunk
 _timescaledb_internal._hyper_1_3_chunk
 _timescaledb_internal._hyper_1_4_chunk
 _timescaledb_internal._hyper_1_5_chunk
 _timescaledb_internal._hyper_1_6_chunk
 _timescaledb_internal._hyper_1_7_chunk
 _timescaledb_internal._hyper_1_8_chunk
 _timescaledb_internal._hyper_1_9_chunk
 _timescaledb_internal._hyper_1_10_chunk

Since each chunk is a PostgreSQL table, we can view a chunk and see its indexes:

INDEX

index_example=> \d _timescaledb_internal._hyper_1_1_chunk
            Table "_timescaledb_internal._hyper_1_1_chunk"
 Column |            Type             | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
 time   | timestamp without time zone |           | not null |
 id     | integer                     |           |          |
 value  | real                        |           |          |
Indexes:
    "_hyper_1_1_chunk_my_table_id_index" btree (id)
    "_hyper_1_1_chunk_my_table_time_idx" btree ("time" DESC)
Check constraints:
    "constraint_1" CHECK ("time" >= '2022-04-01 00:00:00'::timestamp without time zone AND "time" < '2022-04-02 00:00:00'::timestamp without time zone)
Inherits: my_table

Notice that the two indexes were defined for the chunk:

index_example=> SELECT indexdef
                FROM pg_indexes
                WHERE indexname LIKE '_hyper_1_1_chunk%';
                               indexdef
---------------------------------------------------------------------------
 CREATE INDEX _hyper_1_1_chunk_my_table_id_index ON _timescaledb_internal._hyper_1_1_chunk USING btree (id)
 CREATE INDEX _hyper_1_1_chunk_my_table_time_idx ON _timescaledb_internal._hyper_1_1_chunk USING btree ("time" DESC)

How Do Indexes and Hypertables Work?

As you may now guess from the previous section, any INDEX will be used with a hypertable, just as with any normal PostgreSQL table. All the query planning regarding the internal partitioning into chunks happens automatically.

If we perform a simple query, we will see the INDEX being used in the query planner just like it would be with the normal PostgreSQL table:

index_example=> EXPLAIN SELECT AVG(value)
                FROM my_table
                WHERE id = 5 AND
                      time BETWEEN '2022-04-01 00:00:00' and '2022-04-01 23:59:59';
                               QUERY PLAN
-------------------------------------------------------------------------------
 Aggregate  (cost=28.35..28.36 rows=1 width=8)
   ->  Index Scan using _hyper_1_1_chunk_my_table_id_index on _hyper_1_1_chunk  (cost=0.29..28.29 rows=24 width=4)
         Index Cond: (id = 5)
         Filter: (("time" >= '2022-04-01 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-01 23:59:59'::timestamp without time zone))

We need to note a couple of things about the query plan:

  • Timescale Cloud inherently knows which chunks to query based on the time INTERVAL when the hypertable was created.
  • The user-added INDEX (on the id column) is consulted to find the appropriate rows.

Timescale Cloud hides the complexity of indexing while at the same time providing the same functionality as normal PostgreSQL indexes.

How Do I Define an INDEX Directly in a Hypertable?

If you want to define indexes directly in your hypertable (after having transformed your original PostgreSQL table into a hypertable), the process is exactly the same as normal PostgreSQL. Use CREATE INDEX!

If, for some reason, the id INDEX was not created with the original PostgreSQL table, one just needs to use the same syntax with the hypertable:

CREATE INDEX my_table_id_index ON my_table(id);

Do I Need to Do Anything Differently With Indexes and Hypertables vs. Regular PostgreSQL?

No! And that is the point: use Timescale Cloud and hypertables just as one would with PostgreSQL tables!

What About Creating Indexes on Existing Large Databases?

Actually, Timescale Cloud provides a “lighter” load than normal PostgreSQL when creating a new INDEX on an existing large table. One can use an option when creating a new INDEX that will create indexes on a per-chunk basis, which means only locking that specific chunk! This means the entire table is not locked during a CREATE INDEX.

More information on CREATE INDEX on a hypertable can be found in our docs.

Editor’s Note: If you want to learn more about indexes in Timescale Cloud, check out our documentation and best practices for indexing.

Next Steps

We hope this article gave you the answers you were looking for on how database indexes in PostgreSQL and Timescale Cloud work together.

Timescale Cloud expands PostgreSQL for time-series data, adding the superpowers you need for your time-series applications: high performance at scale, cost-efficiency, and time-saving features specifically designed for time series. All without ever leaving behind the PostgreSQL you already know and love. So if you’re handling time-series data, give it a try!


Sign up for a free 30-day trial (no credit card required), and start indexing!

The open-source relational database for time-series and analytics.
Try Timescale for free
This post was written by
6 min read
Cloud
Contributors

Related posts