TimescaleDB 1.1 performance optimizations and PG11 support (beta)
New release includes beta support for PG11, as well as built-in optimizations and open-source tooling to improve performance.
Recently, we announced that TimescaleDB is production ready and is the first enterprise-ready time-series database to support full SQL and scale. To get to this point, our team spent over two years of dedicated engineering effort to harden the database, ensuring stability, ease of use, and reliability.
Today, we are excited to announce TimescaleDB 1.1 with new features focused on enhancing and simplifying the user’s experience and beta support for PostgresSQL 11.
Beta support for PG11
PostgreSQL 11 was released earlier this fall, and we now support it! Those of you who know PostgreSQL should be very excited right now because version 11 adds some pretty awesome features. A few of my favorites include:
- Covering indexes allows you to include unindexed columns in an index, which seems a little counterintuitive at first, but can be very helpful when you want to enforce uniqueness on a primary key, but also include an extra column that is not included in the primary key in the index in order to allow index-only scans. Until PG11, you had to create two indexes: one to enforce the primary key and another to make sure your index only scans worked well. We find covering indexes to be particularly useful when storing and querying time-series data where data is stored in a narrow table (e.g. timestamp, id, value), ingested in time-order, but then queried by either device or metric id.
- JIT compilation can speed long running queries dramatically by compiling them to byte code while the query is running. Postgres tends to have a bit of extra overhead for function calls due to the organization of the executor. Timescale doesn’t yet use JIT to make our functions into byte code, but we can still take advantage of all of the normal Postgres functions that can be JITted. We’re planning on taking a look at how much this can affect long running queries in our benchmarks, but we’d also love to hear from users who have seen improvements!
- Parallel Append means that individual chunks will actually be scanned in parallel! Before, parallel scans happened only within a chunk, but not between chunks. Now, parallel scans will actually send workers to scan multiple chunks at the same time, which can reduce cpu contention and significantly reduce io contention when chunks are on separate tablespaces.
- `IN` and `ANY` queries, often used to query a time-series table for a given array of values, now smartly exclude unneeded chunks. For instance, in IoT, you might query metrics for the last month for device1, device2 and device5. Or in a SaaS application, you might similarly query usage metrics for the last month for customer ids 1, 5, 20 and 47. One thing to note is that this particular query optimization does not apply to subqueries.
- first() and last() queries now leverage indexes to return results without scanning the whole table when possible. This is a common query for monitoring, where you might want to view the last (or current) metric for a given set of devices. We don’t yet support index scans on first and last queries with group-bys even if there is an index that might support such a scan. We do plan on optimizing that sort of scan in the future.
[Special thanks to TimescaleDB Software Engineer Niksa Jakovljevic]
While it’s easy to get up and running with TimescaleDB if you already have a PostgreSQL installation, getting Postgres set up for the first time can be a bit harder. In this release, we addressed two common user requests: making it easier to tune PostgreSQL to optimize performance and providing a faster option for getting started on Amazon.
The default PostgreSQL configurations are a bit notorious for being, shall we say, conservative. They’re essentially made so that if you install Postgres on a Raspberry Pi, it will work! And it (usually) won’t OOM! This, however, sometimes leads to folks wondering why their 32 core server isn’t really achieving much better performance than a Raspberry Pi. Prior to timescale-tune, users had to manually tweak the PostgreSQL configuration file to fully leverage available hardware and get the most out of TimescaleDB.
To simplify this process for users, we’ve created a command-line tool that can be invoked after installation that “lints” a user’s configuration to make sure it’s ready to go. The tool, called timescaledb-tune (GitHub), helps users initially setup their postgresql.conf file with reasonable settings for memory, parallelism, the WAL, etc. With 1.1, we will be soft releasing this tool packaged with Debian and Ubuntu releases. Since this is an early version, users should consider this tool to still be in beta, although the tool does require user acceptance before it actually writes any changes to the postgresql.conf file.
[Special thanks to TimescaleDB Software Engineer Rob Kiefer]
Getting started with our Amazon AMI
As an open-source cloud-agnostic database, we also wanted to support a smooth onboarding experience for users looking to try us out on the cloud. We started with Amazon by providing a template AMI that users can install quickly using pre-configured settings. You can try it out for yourself by following our instructions in our Docs.
[Special thanks to TimescaleDB Software Engineer Lee Hampton]
As we develop TimescaleDB, we are also continuously working to improve our packaging, installation, and onboarding experience for users. If you have any feedback for us, we encourage you to get in touch via our Slack community.
If you are new to TimescaleDB and ready to get started, follow the installation instructions. If you are looking for enterprise-grade support and assistance, please let us know. Finally, if you are interested in helping us build the next great open-source company, we are hiring!
Like this post? Interested in learning more? Follow us on Twitter or sign up for the community mailing list below!