Introducing One-Click Database Forking in Timescale
[Note: This blog post was originally published in December 2021 and updated in March 2022 to reflect new functionality released as part of #AlwaysBeLaunching Cloud Week 🐯 ☁️.]
Announcing improved support for database forking in Timescale. Create a fork of your database, with the same or different resource configurations, in just a few clicks. Forks allow you to conveniently create testing and staging environments, safely test major upgrades, downsize your service, give access to production data while fully isolating the production database, and much more. 🔥
We’re excited to continue #AlwaysBeLaunching MOAR Edition—a week full of exciting new features for Timescale, bringing you MOAR features that make Timescale even MOAR worry-free, scalable, and flexible!
Today, we’re releasing improved one-click database forking in Timescale. New functionality lets you easily spin up forks (identical copies of your database) that can now have different resource configurations (CPU, memory, or disk) than the primary database.
One-click database forking on Timescale gives development teams the ability to perform a variety of important tasks in less time and with more flexibility: what previously took a considerable amount of manual work can now be done in one or two clicks.
Database forking is useful for dev, test, and staging environments, including testing for performance regression, application changes, or safe database upgrades. It similarly allows you to easily evaluate the impact of schema, index, or configuration changes outside your production database. That is, forks help grow your confidence with any in-place production changes.
In addition to taking only one or two clicks to create, database forks on Timescale are cost-effective and spare you the risk of testing on your own production database. Forked databases are billed at an hourly rate, so you only pay for the time your forked instance is active. This means that you can safely test the impact of PostgreSQL version upgrades, changes in application code, a new database INDEX, and more for just pennies per hour.
Moreover, with the new ability to upsize or downsize your database fork, you have greater control over resource consumption. For example, if turning on compression reduces your Timescale storage consumption by 90%, you can now fork into a service with much smaller configured storage, leading to lower costs. Forking is also a seamless way to grant data science or business intelligence teams access to your production data (but not to the production database) without a separate ETL job or data pipeline, and you can create forks with more or fewer resources depending on the needs of their projects.
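The storage arithmetic in the compression example can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: the 500 GB starting size, the 25% headroom, and the 25 GB rounding step are all hypothetical illustration values, not Timescale plan sizes or recommendations.

```python
import math


def suggested_fork_storage_gb(current_used_gb, compression_savings,
                              headroom=0.25, step_gb=25):
    """Estimate a storage size for a downsized fork.

    current_used_gb:      storage used today (hypothetical input)
    compression_savings:  fraction saved by compression, e.g. 0.90 for 90%
    headroom:             extra fraction kept free for growth (assumed value)
    step_gb:              round up to this increment (assumed value)
    """
    if not 0 <= compression_savings < 1:
        raise ValueError("compression_savings must be in [0, 1)")
    compressed_gb = current_used_gb * (1 - compression_savings)
    needed_gb = compressed_gb * (1 + headroom)
    # Round up so the fork never ends up smaller than the estimate.
    return math.ceil(needed_gb / step_gb) * step_gb


# 500 GB used, 90% savings: about 50 GB of data, 62.5 GB with headroom,
# rounded up to the next 25 GB step.
estimate = suggested_fork_storage_gb(500, 0.90)  # 75
```

Measure the actual size of your compressed data before choosing a configuration; the function above only makes the back-of-the-envelope calculation explicit.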
To fork a service in Timescale, follow these simple steps:
- Select the database service you’d like to fork.
- Navigate to the “Operations” tab, and click the “Fork service” button.
- Choose your configuration. To create a fork with the same configuration as the parent service, click on “Create fork.” To fork with a different configuration than the parent service, click instead on “Advanced options,” which will allow you to select your forked service's compute and disk size.
Once created, your forked service is a completely independent database. While it inherits the settings of its parent, any subsequent changes in the forked service won’t be reflected in the parent database (and vice versa). In the Timescale UI, you will see which database service each fork was created from.
Database forking is available to all users of Timescale. If you’re new to Timescale, you can create a free account to get started (100% free for 30 days and no credit card required).
Have questions or feedback? Join our community! You can chat with us in our Community Slack, and for in-depth technical questions, you can use the Timescale Forum. Feel free to ask us anything!
Keep reading for more information on how forking works under the hood, get some ideas for how to simplify your development workflows, and get insights on our roadmap for forking in Timescale.
A huge thank you to all the engineers and designers who worked on this feature.
The Many Uses of Database Forking
The ability to easily fork your database comes in handy for multiple scenarios. Here are some common ones:
Testing. If your team needs a common image of your database for running correctness or performance regression tests, we recommend leveraging forks in the following way. (1) Create a "golden image" of your database by creating schemas, loading data, etc. (2) Pause that service so that you'll only pay for storage rather than compute, and so that you prevent unwanted modifications. (3) Create a fork of your paused database service, and (4) run your tests against the running fork. (5) Finally, once your team is finished, delete the fork. You can repeat this process as regularly as you need in your testing environment. And if you ever want to tweak your base service, just resume it again and treat it like a normal database.
Moreover, you can test the impact of different workloads on your database and find the optimal resource configuration to deal with them, helping you reduce uncertainty for seasonal events. For example, you might anticipate a high load event on your database but aren't sure exactly what resources you'd need. You can create forks with different resource configurations and stress test against them to see which configuration performs best.
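Once you have stress tested several forks, comparing them can be as simple as the sketch below. It assumes you have already collected query latencies (in milliseconds) from each fork with your own load-testing tool; the configuration labels, latency samples, and the 100 ms target are hypothetical illustration values, and the nearest-rank 95th percentile is just one reasonable way to summarize a run.

```python
def p95(latencies_ms):
    """Nearest-rank 95th percentile of a list of latencies."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.95 * n), avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def smallest_config_meeting_target(results, target_ms):
    """Return the first configuration whose p95 latency meets the target.

    `results` is a list of (label, latencies_ms) pairs ordered from the
    smallest configuration to the largest. Returns None if none qualifies.
    """
    for label, latencies_ms in results:
        if p95(latencies_ms) <= target_ms:
            return label
    return None


# Hypothetical measurements taken against three forks of the same service.
stress_test_results = [
    ("2 CPU / 8 GB", [120, 135, 150, 400, 180]),
    ("4 CPU / 16 GB", [60, 70, 65, 90, 80]),
    ("8 CPU / 32 GB", [40, 45, 42, 50, 48]),
]

choice = smallest_config_meeting_target(stress_test_results, target_ms=100)
# choice == "4 CPU / 16 GB"
```

Picking the smallest configuration that meets the target, rather than the fastest one overall, keeps the comparison tied to cost as well as performance.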
Safer database upgrades. When doing a major upgrade in a production database (e.g., upgrading from PostgreSQL 12 to 13), we recommend forking your service first. This way, you can perform the upgrades on the forked service first, ensuring that there are no issues related to this change. Once you’re sure the upgrade was successful, you can be confident that everything will work well when running it on your production service.
Downsize your service. Forks also allow you to easily downsize your service to eliminate unnecessary costs. In certain situations, you may find yourself paying for a significant amount of unused storage; perhaps your data size has been reduced considerably after enabling or optimizing TimescaleDB compression, or perhaps you’re seeing less traffic than originally predicted. In that case, you can conveniently downsize your service using forks: (1) create a fork of your original service, assigning it a smaller disk and/or less compute, and (2) connect your application to the fork, which now acts as your primary database. Once your old and more expensive service is no longer in use, you can delete it. And if your data volume increases, you can scale up your service again by simply navigating to the “Resources” tab and selecting a larger storage plan.
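Connecting your application to the fork usually comes down to changing the connection string it uses. The Python sketch below shows one way to do that, assuming only the host (and possibly the port) differ between the two services; confirm the actual connection details of your fork in the Timescale UI. The hostnames, user, password, and database name are placeholders, not real Timescale values.

```python
from urllib.parse import urlsplit, urlunsplit


def point_at_fork(connection_url, fork_host, fork_port=None):
    """Return connection_url with its host (and optionally port) replaced.

    Credentials, database name, and query parameters are kept as they are.
    """
    parts = urlsplit(connection_url)
    # Everything before the last "@" is the user:password part, if any.
    userinfo, _, _ = parts.netloc.rpartition("@")
    port = fork_port if fork_port is not None else parts.port
    hostport = fork_host if port is None else f"{fork_host}:{port}"
    netloc = f"{userinfo}@{hostport}" if userinfo else hostport
    return urlunsplit(parts._replace(netloc=netloc))


# Placeholder connection string for the original (parent) service.
production_url = (
    "postgres://tsdbadmin:example-password@prod.example.com:5432/tsdb"
    "?sslmode=require"
)

fork_url = point_at_fork(production_url, "fork.example.com")
# fork_url == "postgres://tsdbadmin:example-password@fork.example.com:5432/tsdb?sslmode=require"
```

Keeping the connection string in configuration (an environment variable, for example) rather than in code makes the switch to the fork a one-line change.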
Create and refresh staging environments. An important aspect of a good testing procedure is having a staging environment with production data so you can test the quality of your new features in real-world conditions. Through database forking, you can spin up an exact copy of your production data without affecting the actual production service. Also, as your production data changes over time, it is good practice to refresh your staging service, since conditions may change. Having an easy procedure for forking makes this task painless.
Provide access to production data (but not to the production database). Many times, teams of data scientists or business intelligence analysts might want access to production data to query and analyze. Database forking enables you to provide access to production data, without having to provide access to the database itself. This is especially useful for cases where implementing access control via PostgreSQL might be too complicated. Simply fork the database and provide teams access to a copy of that data and a dedicated connection string while your production database continues to operate unaffected.
You can also give the forked database a different set of resources than the original database, depending on the needs of other teams it's being shared with and the duration of their projects. For example, you could create a fork with more CPU/RAM resources if your data science or BI team is doing a short-term project with heavy analytical or OLAP queries—this helps those teams move quickly as they won't have to wait as long for queries to execute while leaving your production database and its operations unaffected.
Create a snapshot of your data. Keeping database snapshots can be very useful for auditing and reporting, and also for doing potential analysis or forensics after carrying out an important change in your service configuration.
Forking: Under the Hood
Every service running in Timescale has a backup that we regularly test. This backup is more than a snapshot of the data directory: powered by the continuous archiving feature of PostgreSQL, it contains all database changes up to a given point in time. It can be used to restore a database even when the original volume containing the data directory is unavailable.
To fork a service in Timescale, instead of restoring the backup of the parent service in place, it is restored to a new instance (the fork) that becomes a clone of the original one.
See our blog post on continuous backup/restore validation on Timescale for more details.
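To make the idea of restoring a backup into a new instance more concrete, here is a deliberately simplified Python model. It is a conceptual toy, not Timescale's or PostgreSQL's implementation: a dictionary stands in for the base backup, a list of `ChangeRecord` objects (a hypothetical type) stands in for the archived changes, and `restore` replays them into a brand-new copy. The optional cutoff only illustrates why an archive of changes can describe more than one moment in time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRecord:
    position: int   # toy stand-in for a location in the change log
    key: str
    value: object   # None means "delete this key"


def restore(base_backup, archived_changes, up_to_position=None):
    """Build a new, independent state from a base backup plus changes."""
    state = dict(base_backup)  # copy, so the parent's data is never touched
    for record in sorted(archived_changes, key=lambda r: r.position):
        if up_to_position is not None and record.position > up_to_position:
            break
        if record.value is None:
            state.pop(record.key, None)
        else:
            state[record.key] = record.value
    return state


base_backup = {"sensor_1": 20.5}
archived_changes = [
    ChangeRecord(1, "sensor_2", 18.0),
    ChangeRecord(2, "sensor_1", 21.0),
    ChangeRecord(3, "sensor_2", None),
]

fork = restore(base_backup, archived_changes)
# fork == {"sensor_1": 21.0}

earlier = restore(base_backup, archived_changes, up_to_position=2)
# earlier == {"sensor_1": 21.0, "sensor_2": 18.0}

# Writes to the fork do not affect the data it was restored from.
fork["sensor_3"] = 1.0
```

The key property mirrored here is independence: because the fork is built from the backup rather than from the live service, later changes on either side stay on that side.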
At Timescale, we like to move fast (without breaking things), so this is only the beginning for database forking in Timescale. In the near future, we will release the following functionality:
- Forking to a different region than the parent service.
- Forking from an arbitrary point in time (PIT)—other than the latest—so you can fork to older states of your database.
- A programmatic API to automate forking.
- One-click forking for multi-node services. Timescale fully supports multi-node deployments, but for now, forking is only available for single-node databases.
Stay tuned, and let us know if there’s more forking functionality you’d like to see.
And may the forks be with you!
Database forking is available to all users of Timescale. Check out our documentation for more information on forking.
If you are new to Timescale:
- Create an account to get free access to Timescale for 30 days, with no credit card required—and start forking today!
- Join the Timescale Community Slack and ask us any questions about time-series data, TimescaleDB, PostgreSQL, and more. Join us: we are 8,000+ and counting!
- Read our vision for Timescale: a database cloud for relational and time-series workloads, built on PostgreSQL and architected around our vision of a modern cloud service: easy, scalable, familiar, and flexible.
- Check out our Getting Started documentation. These articles walk you through the basics: creating your first instance, accessing your database, loading your data, and so on. Get familiar with key TimescaleDB concepts like hypertables and chunks, compression, or continuous aggregates. Understanding these key features will allow you to use Timescale to its full potential. ✨
And, for those who share our mission of serving developers worldwide 🌏 and want to join our fully remote, global team, we are hiring broadly across many roles!