“Yes,” your mind may go, “I might be able to improve my performance if I partition my tables, but this will come at the cost of countless hours spent on manual configuration, running maintenance jobs, and testing, not to mention the unforeseen issues that might pop up during scaling. It’s like having a potent car with an incredibly complicated gearbox.” If you’re using vanilla PostgreSQL in products like Amazon RDS, there’s a lot of truth to this. You will undoubtedly spend much of your time managing your partitioned tables. Plus, you’ll have to deal with custom scripts, maintain rigorous maintenance practices, and carefully monitor your performance so you can revisit and tweak your configuration whenever your dataset or ingestion rate changes.
But guess what: there’s a better way of creating a Postgres partition, and it’s called hypertables.
Meet Hypertables: Automatic PostgreSQL Partitioning for Your Large PostgreSQL Tables
Hypertables (which are available via the TimescaleDB extension and, in AWS, via the Timescale platform) are an innovation that makes the experience of creating a Postgres partition completely seamless. They automate the generation and management of data partitions without changing your user experience.
Working with a hypertable feels exactly like working with a regular PostgreSQL table. But, under the covers, hypertables handle all the partitioning magic, speeding up your queries and ingests. This performance boost is sustained as your tables' volume keeps growing, making hypertables extremely scalable.
Hypertables look like regular PostgreSQL tables, but under the hood, they’re being automatically partitioned to enhance performance
Hypertables are optimized for time-based partitioning, so this is the type of partitioning that we’ll focus on in this article. However, hypertables also work for tables that aren’t time-based but have a similar monotonically increasing column, for example, a BIGINT primary key.
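As a quick sketch of the non-time-based case, a table keyed by a monotonically increasing BIGINT can be partitioned by that column instead of a timestamp (the `events` table and its column names here are hypothetical, purely for illustration):

```sql
-- Hypothetical table keyed by a monotonically increasing BIGINT
CREATE TABLE events (
    id BIGINT PRIMARY KEY,
    payload TEXT
);

-- Partition by the id column; for integer columns,
-- chunk_time_interval is the number of id values per partition
SELECT create_hypertable('events', 'id', chunk_time_interval => 1000000);
```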
Let’s explain how hypertables work with an example.
Imagine you have a PostgreSQL table called sensor_data, where data from various IoT devices is stored with a timestamp. The table might look something like this:
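Here is a representative sketch of such a table (the exact column names, such as `device_id` and `time`, are assumptions for illustration):

```sql
-- Hypothetical schema for the sensor_data example
CREATE TABLE sensor_data (
    device_id INTEGER NOT NULL,
    time TIMESTAMPTZ NOT NULL,
    temperature DECIMAL,
    humidity DECIMAL,
    PRIMARY KEY (device_id, time)
);
```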
Now, as the volume of sensor_data grows, you start facing performance issues and management complexities. Here’s where hypertables come to help. If you were using Timescale, the only thing you’d need to do is convert your sensor_data table into a hypertable:
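Assuming the timestamp column is named `time` (adjust to match your schema), the conversion is a single call:

```sql
-- Convert the existing sensor_data table into a hypertable,
-- partitioned by the time column into (by default) 7-day chunks.
-- For a table that already contains data, add migrate_data => true.
SELECT create_hypertable('sensor_data', 'time');
```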
Hypertables encapsulate and automate all the manual partitioning steps described earlier, significantly reducing the complexity, manual effort, and potential for errors on your end:
With hypertables, there’s no need to manually create a parent table and define child tables for each time range. You simply convert your existing table into a hypertable.
Hypertables also simplify indexing. When you create an index on a hypertable, Timescale automatically creates the corresponding indexes on all current and future partitions, ensuring consistent query performance without manual adjustments.
Hypertables automatically create new partitions on the fly based on the specified time interval. As new data is ingested, appropriate partitions are ready to store the data without manual intervention or scheduled jobs. Using Timescale eliminates the risk of partitions not existing, completely removing partition management from your to-do list.
Once your PostgreSQL table becomes a hypertable, you can keep querying it as usual. You will instantly experience a performance boost. When you execute a query, Timescale’s query planner intelligently routes the query to the appropriate partition(s), ensuring that only relevant data is scanned. This process remains completely transparent; you don't need to think about it or worry about which partition contains which data.
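For example, a time-bounded query only touches the partitions that overlap the requested range (again assuming the illustrative `sensor_data` schema with a `time` column):

```sql
-- Only the partitions covering the last 24 hours are scanned;
-- the planner excludes all older partitions automatically
SELECT device_id, avg(temperature)
FROM sensor_data
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY device_id;
```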
Partitioning Is Only the Beginning: Features Unlocked With Hypertables
Hypertables make partitioning seamless and unlock a wealth of features that will help you improve your PostgreSQL performance even further and save you time when managing your data.
A few examples:
Columnar compression for faster queries and cheaper storage. By enabling Timescale compression, your hypertable changes from row-oriented to column-oriented storage. This can reduce storage usage by up to 95% and unlock blazing-fast analytical queries while still allowing the data to be updated.
Continuous aggregates automatically refresh and store aggregated data, enabling you to build fast visualizations, including real-time insights and historical analytics that go back in time.
Easy and configurable data retention. Hypertables allow you to set up automatic data retention policies with one simple command: add_retention_policy. You just tell Timescale when you want your data dropped, and your hypertables will automatically drop outdated partitions when it’s time.
SQL hyperfunctions to run analytics with fewer lines of code. Hypertables come with a full set of hyperfunctions: blazing-fast analytical functions, procedures, and data types optimized for efficiently querying, aggregating, and analyzing large volumes of data.
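As a sketch of how two of these features are enabled (the table and column names are illustrative, reusing the hypothetical `sensor_data` schema):

```sql
-- Enable columnar compression, segmenting by device
-- so per-device scans stay fast on compressed data
ALTER TABLE sensor_data SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id'
);

-- Compress partitions once they are older than 7 days
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');

-- Automatically drop partitions older than 90 days
SELECT add_retention_policy('sensor_data', INTERVAL '90 days');
```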
Managing high-velocity data from a smart grid
An engineering team at a leading energy company is tasked with managing the data from a newly installed smart grid, a big investment for the energy company, which now has granular insights into energy consumption, distribution efficiency, and grid health metrics. Elements in the smart grid generate thousands of energy metrics per second that need to be properly collected, analyzed, and managed.
These energy metrics are currently stored in PostgreSQL, but the engineering team has to figure out the best solution to ingest this high-velocity data efficiently without losing granularity or accuracy. They must also ensure they can query this data quickly for real-time monitoring and analysis.
This would be an ideal use case for Timescale:
Timescale’s hypertables can handle the high ingestion without imposing manual work on the team.
Hypertables also optimize query performance, ensuring that real-time energy data is readily accessible for queries.
As the smart grid expands, Timescale's hypertables will seamlessly scale, accommodating increased data volumes without compromising performance.
Given that Timescale is built on PostgreSQL, the engineering team can leverage their existing knowledge and tools, ensuring a smooth transition and minimal learning curve.
-- Creating a table to store energy metrics from the smart grid
CREATE TABLE energy_metrics (
    element_id INTEGER NOT NULL,
    event_time TIMESTAMPTZ NOT NULL,
    voltage DECIMAL NOT NULL,
    current DECIMAL NOT NULL,
    frequency DECIMAL NOT NULL,
    PRIMARY KEY (element_id, event_time)
);

-- Converting the energy_metrics table into a hypertable
SELECT create_hypertable('energy_metrics', 'event_time');

-- Sample query to ingest new metrics data into the hypertable
INSERT INTO energy_metrics (element_id, event_time, voltage, current, frequency)
VALUES (1, NOW(), 210.5, 10.7, 50.01);

-- Sample query to retrieve the latest energy metrics for real-time monitoring
SELECT * FROM energy_metrics
WHERE element_id = 1
ORDER BY event_time DESC
LIMIT 10;
Building dashboards for monitoring sensor data
An industrial manufacturing company operates a range of heavy machinery and equipment in its facilities. Each piece of machinery is equipped with sensors that continuously monitor and log temperature data in a sensor_data table to ensure optimal performance and safety.
The company needs its PostgreSQL database to achieve two distinct yet critical objectives:
Provide engineers and maintenance staff with real-time temperature data to detect anomalies and ensure that machinery is operating within safe temperature ranges.
Analyze historical temperature data to identify trends, predict maintenance needs, and improve operational efficiency.
The team decides to turn sensor_data into a hypertable. To facilitate real-time monitoring, they create a continuous aggregate to calculate the average temperature for every piece of machinery, updated every minute:
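A sketch of such a continuous aggregate, assuming the sensor_data table has machine_id, time, and temperature columns (names are illustrative):

```sql
-- Per-minute average temperature per machine
CREATE MATERIALIZED VIEW real_time_avg_temp
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 minute', time) AS bucket,
    machine_id,
    avg(temperature) AS avg_temp
FROM sensor_data
GROUP BY bucket, machine_id;

-- Keep the aggregate fresh by refreshing it every minute
SELECT add_continuous_aggregate_policy('real_time_avg_temp',
    start_offset => INTERVAL '10 minutes',
    end_offset => INTERVAL '1 minute',
    schedule_interval => INTERVAL '1 minute');
```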
With real_time_avg_temp, the maintenance team has immediate access to the average temperature of every machinery piece, enabling swift responses to temperature anomalies and preventing potential breakdowns.
For historical analysis, the team creates another continuous aggregate view, this time aggregating daily average temperatures:
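Under the same assumed schema, the daily view is analogous, just with a wider bucket:

```sql
-- Daily average temperature per machine for historical analysis
CREATE MATERIALIZED VIEW daily_avg_temp
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', time) AS bucket,
    machine_id,
    avg(temperature) AS avg_temp
FROM sensor_data
GROUP BY bucket, machine_id;

-- Refresh once a day
SELECT add_continuous_aggregate_policy('daily_avg_temp',
    start_offset => INTERVAL '3 days',
    end_offset => INTERVAL '1 day',
    schedule_interval => INTERVAL '1 day');
```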
Both views (real_time_avg_temp and daily_avg_temp) feed into a monitoring dashboard. The maintenance team gets alerted to potential issues as they arise. At the same time, the team can review historical temperature trends, conduct analyses to predict when machinery might need maintenance, and optimize operational protocols to mitigate excessive temperature fluctuations.
Storing large volumes of weather data effectively
An environmental research institute is collecting and analyzing many TBs of weather data to study climate change. The team already knows PostgreSQL, so they want to stick to it—but the storage cost is becoming a concern.
The team decides to start using Timescale. After converting their tables to hypertables and enabling compression, their storage costs become a fraction of what they were, with the data remaining fully accessible for analysis.
Powering real-time analytics for a crypto exchange
A new crypto exchange is grappling with the challenge of providing real-time analytics to traders. As the data volume stored in the underlying PostgreSQL database increases, the engineering team struggles to keep the database fast enough. To them, it’s essential to deliver a better user experience than the competition, which has a more established but slower product. Keeping up the speed and responsiveness of their portal is paramount.
The team knows that by partitioning their large pricing table, they’ll most likely improve query performance. Instead of attempting to manage partitioning themselves, since they’re already swamped, the engineers decide to implement Timescale.
PostgreSQL partitioning is a powerful tool for managing large tables, but on its own, it can be complex to implement and maintain. Timescale's hypertables make the whole process seamless and automatic. The best part is that by using hypertables, you’ll unlock a myriad of other features (like columnar compression and automatic materialized views) that will make it even easier to scale your PostgreSQL database.