New: Helm Charts for Deploying TimescaleDB on Kubernetes
Quickly & easily add TimescaleDB to your Kubernetes deployment strategy
✨ Update: As of February 4, 2021, TimescaleDB 2.0 and multi-node are officially Generally Available ✨. See our announcement post for details.
The world of software moves fast, and we have seen some remarkable evolutionary cycles in how applications are deployed: a move away from monolithic stacks (bare metal, VMs, containers) toward the deployment and resilience of an application's core components (microservices).
As developers, we know that it’s critical to bring the best software and customer experience to market as fast and reliably as possible. At Timescale, we’re committed to building the best time-series database for all developers and scenarios – which is why our team built the ability to deploy TimescaleDB alongside other cloud-native technologies.
As a first step, we’ve released Helm Charts to help with deploying TimescaleDB on Kubernetes, which we will cover in this post. Download them today from GitHub (released under the Apache 2 open-source license).
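If you want to try the charts right away, a typical install looks something like the following. This is a sketch using Helm 3 syntax; the repository URL and chart name are taken from the public GitHub project, but you should verify both against the repository README before running these commands.

```shell
# Add the Timescale chart repository (URL per the project README) and
# refresh the local chart index.
helm repo add timescale https://charts.timescale.com/
helm repo update

# Deploy a single-primary TimescaleDB cluster named "my-release" into
# the current namespace, using the chart's default values.
helm install my-release timescale/timescaledb-single
```

From there, `helm upgrade` with a custom values file lets you adjust replica counts, storage, and backups, and `helm uninstall my-release` tears the deployment down.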
As orchestration technology allows us to embrace cloud-native architectures, Kubernetes is a (arguably the) core piece of managing and orchestrating microservice architectures. Early in my tenure with Timescale, we decided to invest in TimescaleDB’s native support for such technology (hat tip to Timescale engineers Feike Steenbergen, Ian Davis, and the rest of our cloud team for the heavy lifting).
To start to turn our strategy into reality, the TimescaleDB Cloud team developed a set of Helm Charts, which allows you to quickly and easily add TimescaleDB to your Kubernetes deployment alongside other cloud-native technologies.
Let's take a closer look at what deployment using TimescaleDB Helm Charts looks like:
In the diagram above, we are deploying the service with three pods: a primary and two replicas. Our Helm charts also deploy the Patroni agent – originally co-developed by Feike – to facilitate automatic leader election and failover amongst replicas for high availability. If the primary ever becomes unavailable, a replica automatically takes over as the new primary, and the cluster reconfigures itself to avoid any downtime.
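You can watch this failover behavior in action: Patroni labels each pod with its current role, so you can see which pod is the primary, delete it, and observe a replica being promoted. The label key and value (`role=master`) and the `release` selector are assumptions based on common chart conventions; run `kubectl get pods --show-labels` to confirm what your chart version uses.

```shell
# List the cluster's pods and show the Patroni-managed role label
# (label key "role" is an assumption; verify with --show-labels).
kubectl get pods -l release=my-release -L role

# Simulate a primary failure by deleting the current primary pod.
# Patroni promotes a replica, and Kubernetes reschedules the deleted
# pod, which then rejoins the cluster as a replica.
kubectl delete "$(kubectl get pods -l release=my-release,role=master -o name)"
```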
This (the diagram above) represents our single-primary deployment option which generates built-in availability via nodes for failover and the ability to horizontally scale the read operations.
Note: to ease any storage-based I/O contention, we are deploying both a data volume and WAL volume for each node. Disk volumes are also remote from the actual database pod (e.g., as EBS volumes on AWS), so that even if the pod fails, k8s can automatically bring up a new instance of the database, reconnect it to the detached disk volumes, recover the database, and reintegrate it into the TimescaleDB cluster. All automatically.
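The separate data and WAL volumes are configured through the chart's values. The fragment below is a sketch of such an override file; the `persistentVolumes` key layout follows the chart's general shape but should be checked against the default `values.yaml` in the GitHub repository.

```yaml
# values.yaml sketch: three pods, each with its own remote data volume
# and a separate, smaller volume for the write-ahead log (WAL), so WAL
# writes don't contend with data-file I/O.
replicaCount: 3

persistentVolumes:
  data:
    size: 100Gi   # main PostgreSQL data directory
  wal:
    size: 20Gi    # dedicated write-ahead log volume
```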
Now let's take a look at what TimescaleDB will look like if we use a hosted Kubernetes service, like Amazon Elastic Kubernetes Service (EKS):
Again, you see our three-node deployment with a primary and two replicas. However, we’ve built a few things into the deployment that are worth mentioning.
When deploying on AWS EKS:
- The pods will be scheduled on nodes that run in different Availability Zones (AZs).
- An AWS Elastic Load Balancer (ELB) is configured to route incoming traffic to the primary pod.
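In values terms, the EKS-specific behavior amounts to a load-balancer toggle plus pod anti-affinity across zones. The `loadBalancer.enabled` key is an assumption about the chart's values layout (verify against its `values.yaml`); the anti-affinity stanza uses the standard Kubernetes pod-spec fields.

```yaml
# values.yaml sketch for an EKS deployment.
loadBalancer:
  enabled: true   # provisions an AWS ELB routing traffic to the primary

# Standard Kubernetes anti-affinity: require pods of this release to
# land in different Availability Zones.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: topology.kubernetes.io/zone
        labelSelector:
          matchLabels:
            release: my-release
```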
When configured for Backups to S3:
- Each pod will also include a container running pgBackRest.
- By default, two CronJobs are created to handle weekly full and daily incremental backups.
- The backups are stored in an S3 bucket.
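Enabling backups is again a matter of chart values. The sketch below follows the general shape of a pgBackRest S3 configuration; the exact key names should be checked against the chart's `values.yaml`, and the bucket, region, and credentials shown are placeholders.

```yaml
# values.yaml sketch: enable the pgBackRest sidecar container and point
# its first repository at an S3 bucket. Credentials are typically
# supplied via a Kubernetes Secret rather than inline values.
backup:
  enabled: true
  pgBackRest:
    repo1-type: s3
    repo1-s3-bucket: my-backup-bucket   # placeholder bucket name
    repo1-s3-region: us-east-1          # placeholder region
```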
As you can see, we have built this out so that you are able to deploy to a cloud-native service and leverage all that it has to offer (e.g., availability zones, backups, storage).
As we mentioned above, building the Helm Charts was just the first step, and you can check them out today via our public GitHub. This project is currently in active development.
We’d love the Timescale Community’s feedback on the progress we’ve made so far. To do this, please either:
- File an issue in GitHub (within the TimescaleDB Kubernetes repository)
- Reach out to us via our community Slack channel
We have a similar set of Helm Charts available for deployment with our multi-node offering (available in the same GitHub project; read the blog post for more details). This allows you to horizontally scale your TimescaleDB instance, while deploying it as a cloud-native application.
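Installing the multi-node variant looks much like the single-node case. The chart name `timescaledb-multinode` matches the GitHub project at the time of writing, and the `dataNodes` value is an assumption; check the chart's `values.yaml` for the exact parameter.

```shell
# Deploy a multi-node TimescaleDB cluster (one access node plus data
# nodes) from the same chart repository, requesting three data nodes.
helm install my-multinode timescale/timescaledb-multinode \
  --set dataNodes=3
```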
We’re excited to offer these deployment options to our users and are keen to hear about your experiences (and suggestions!) as you travel down the microservices path with Timescale.