Read Before You Upgrade: Best Practices for Choosing Your PostgreSQL Version
PostgreSQL has a long-standing reputation for having a miserable upgrade process. So, when the community heartily recommends that you should upgrade as soon as possible to the latest and greatest PostgreSQL version, it's not really surprising that your heart sinks, your mouth goes dry, and the outright dread of another laborious job takes over.
It's almost like finishing a long hike or trying to convince somebody that Betamax was better than VHS. Eventually, you just want it to be over so you can take a nap. There's not even any joy about all the new features and speed. It's just too exhausting to generate emotion anymore.
This blog post will hopefully serve as a guide for when to pull off the old band-aid: when you should upgrade and which PostgreSQL version you should select as a target. We'll also introduce our best practices for upgrading your PostgreSQL version in Timescale, so you can get through the upgrade process as quickly and safely as possible.
When to Upgrade PostgreSQL: Common Myths
The PostgreSQL Global Development Group has simplified the upgrade process quite a bit with the more explicit two-part version numbering introduced in PostgreSQL 10. There are now only two kinds of upgrade: replacing the binaries (a minor version change) or converting the data on disk (a major version change).
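Under this numbering scheme, telling the two apart is mechanical. Here's a minimal sketch (the helper name `upgrade_kind` is my own, and it assumes post-10, two-part version strings like "13.2"):

```python
def upgrade_kind(current: str, target: str) -> str:
    """Classify a PostgreSQL upgrade under the post-10 numbering scheme,
    where "13.2" means major version 13, minor version 2."""
    current_major = int(current.split(".")[0])
    target_major = int(target.split(".")[0])
    if current_major == target_major:
        # Same on-disk format: just swap the binaries and restart.
        return "minor"
    # Different major version: the data on disk must be converted
    # (e.g., with pg_upgrade or a dump/restore).
    return "major"
```

So going from 13.2 to 13.7 is a binary swap, while going from 13.7 to 14.0 means converting the data on disk.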
The developers of PostgreSQL never really had a plan in mind for when and how to upgrade. This seems a bit of a harsh statement when tools like pg_upgrade exist but bear with me. These tools were meant to make upgrades possible, not to imply any particular schedule or recommendations for an upgrade plan. The actual upgrade implementation was always left as an exercise for the administrator.
Let's start with some of the community's conventional wisdom and pretend that those ideas were actually a plan of sorts.
Myth 1: “Upgrade as fast as possible, every time”
This "plan" is based on the fear of existing bugs. It is a very Rumsfeldian plan that assumes you don't know what the bugs are, but you're certainly better off if they're fixed. This makes for a very aggressive upgrade pace and hopes for a better tomorrow rather than a stable today.
Myth 2: "Upgrade when you have to"
The complete opposite fear-based pseudo-plan is to stick to the existing version—come hell or high water—unless you run into an otherwise unfixable bug that affects your installation. This is based on the idea that the bugs we know are better than the bugs we don't know. Unfortunately, it ignores the bugs you don't even know exist.
Myth 3: “Upgrade for every minor version”
This is the general recommendation of the PostgreSQL Global Development Group. The general idea is that all software has bugs, and upgrading is better than not upgrading. That is a bit optimistic about new bugs not being introduced, and it rather ignores that new features you don't care about still have to be configured, or else.
This comes a bit closer to planning than guessing for minor versions, since minor versions of PostgreSQL do not change the on-disk format; they only change the binaries. These upgrades tend to be super heavy on bug fixes and very light on new features, and new features are where bugs tend to get introduced. Still, this rule doesn't say anything about bugs you have actually encountered, nor about improvements you might actually benefit from.
Myth 4: “Upgrade when you have time to kill”
Probably the most dangerous plan since you will never have more time in the future and will probably never upgrade. Experience says that this is a completely silly plan that never gets implemented.
Myth 5: “Upgrade when there are security fixes”
Okay, this makes some kind of sense. Unfortunately, it ignores the rest of your installation and puts the application development team into tailspin mode for your DevOps enjoyment. It is the kind of policy you end up with when the DevOps team doesn’t really care about the Apps team.
When to Upgrade PostgreSQL
Much of this guide is based on personal experience with PostgreSQL upgrades over the years. In some cases, the old was better than the new, and in others, the other way around. In some cases, the fixes worked immediately. In others, well, not so much.
Very few hard and fast rules can be drawn when coming up with a plan of this nature, but I'll try to bring the experience to bear in a way that helps to make a decision in the future. That being said, this is a "best practice" based on experience, not a "sure-fire thing."
As a way to reduce the sheer subjectivity and opinion around choosing the moment to upgrade, I've taken a look through the release notes of PostgreSQL. In this lookie-look, I've attempted to note where bug fixes occurred and mentally trace them back to the version where the bugs were introduced. Unfortunately, this task is also somewhat subjective, as I was not a part of the bug fix development or the bug discovery. So these are just educated guesses, but I hope rather good ones.
Then I looked at the mental list that I had made and thought about whether it matched my personal experience with successful versus unsuccessful upgrades. It (again) seemed a subjectively good indicator of when an upgrade succeeded or failed.
So, on to the findings.
The first thing I noticed in my research is that the biggest upgrade failures were with a new major version containing updates to the write-ahead log (WAL). These were most notable for versions 10 and 12.
Version 10 could fill a book by itself. It was a major undertaking, with quite a few subsystem rewrites. This release added numerous items (like WAL logging for hash indexes), along with improvements and changes to the background writer to support structural changes on disk. These major updates introduced the largest number of unintended behaviors, which lasted the longest before being detected and fixed.
The next most striking failures came from logical replication between 10 and 11. Of course, logical replication was invented for 10, so there had never been an attempt to use it for production upgrades before. This first use in the field was—how should I put it?—interesting.
After that, the bugs died down a lot but were never quite gone.
Here is my list of questions to ask before an upgrade.
1. How big is the change? Was it a major refactor, and did it involve any of the following?
- Query planner: minor.
- WAL: major.
- Background writer: major.
- Memory, caching, locks, or anything else managed by the parent process: minor.
- Index engine: major (or just rebuild all your indexes anyway).
- Replication: major.
- Logging: minor.
- Vacuuming: minor.
2. Were there any huge performance gains?
3. Does it include major security fixes?
4. Are there major improvements or enhancements to built-in functions?
5. Do all of my extensions exist for the new version?
These are my rules of thumb for whether a new PostgreSQL version is compelling for upgrade. Unfortunately, this still requires some subjective evaluation and a bit of professional knowledge. For instance, just because vacuuming is a major subsystem doesn't mean it has ever been a problem during an upgrade. It could be, though, so we should look at its major changes with a skeptical eye.
This brings me to my personal procedure that has (so far) followed the above guidelines.
- Upgrade major versions when they reach minor version .2. That is, 10.2, 11.2, 12.2, etc. This technique avoids the most egregious bugs introduced in major versions but still keeps you reasonably close to the current release.
- Upgrade minor versions as they are available. Minor upgrades have not created major issues thus far in my personal experience. The speed increases, bug fixes, security patches, and internationalization have been worth the minor risk.
- Upgrade immediately when you are more than two major versions behind. The pace of PostgreSQL development will leave you on an unsupported version very quickly. Much quicker than ever before, because the community is committed to a new major version every year. Only five major versions are supported at a time, so an installation that is already a couple of versions behind will be unsupported in approximately three years. That is a very short rope.
- Upgrade when the security team tells you to. It doesn't happen very often, but when it does, it's a major event.
- Upgrade because you need functionality. Things to upgrade for: system-level changes and performance. Things not to upgrade for: functions, operators, and libraries.
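The version-number parts of the procedure above can be condensed into a small decision sketch. This is my reading of the rules, not an official policy; versions are (major, minor) tuples under the post-10 numbering, and the security-team and must-have-functionality triggers are left out, since those come from outside the version numbers:

```python
def upgrade_advice(current, latest):
    """Apply the rules of thumb above to (major, minor) version tuples.
    `current` is what you run; `latest` is the newest available release."""
    cur_major, cur_minor = current
    new_major, new_minor = latest
    if new_major - cur_major > 2:
        # More than two major versions behind: support runs out fast.
        return "upgrade now"
    if new_major > cur_major:
        # Wait for the new major's .2 minor release before jumping.
        return "upgrade major" if new_minor >= 2 else "wait for .2"
    if new_minor > cur_minor:
        # Minor releases are low-risk: take them as they come.
        return "upgrade minor"
    return "up to date"
```

For example, running 13.7 while 14.1 is current says to wait, but once 14.2 ships, it's time to move.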
That's all there is to it.
I hope this blog post has helped you decide when PostgreSQL has compelling new features for you.
Of course, this is only a general rule of thumb. If you feel compelled to upgrade for some other reason, don't let my guide tell you what not to do. It only intends to help in the absence of any other stimuli for upgrade. You do you.
I Am Ready to Upgrade. Now, What?
So now you have followed the checklist above and determined that it’s time for you to upgrade your PostgreSQL version. If you’re running a production database, this may be easier said than done, especially if we are talking about upgrading your major version (e.g., from PostgreSQL 13 to PostgreSQL 14):
- Minor versions of PostgreSQL (e.g., from PostgreSQL 13.0 to PostgreSQL 13.2) are always backward compatible within the same major version. That means that if you upgrade your production database, it is unlikely that anything will break due to the upgrade.
- However, major versions of PostgreSQL are not backward compatible. When you upgrade the PostgreSQL version of a database behind a mission-critical application, the new version may introduce user-facing incompatibilities that require code changes in your application to avoid breakage.
Practical example: if you are upgrading from PostgreSQL 13 to 14, the factorial operators ! and !! are no longer supported in 14, nor is running the factorial() function on negative numbers. What may seem like a silly example in fact illustrates that assumptions about how certain functions (or even operators) behave may break between versions once you upgrade.
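One cheap pre-flight check is to scan your application's SQL for the removed operators before upgrading. Here's a rough, regex-based sketch (the function name is mine, and it is a heuristic, not a SQL parser, so it can misfire on operators hidden inside string literals):

```python
import re

# Postfix factorial like "5!", but not the inequality "5 != x".
_FACTORIAL_POSTFIX = re.compile(r"\d\s*!(?!=)")
# Prefix factorial like "!!5".
_FACTORIAL_PREFIX = re.compile(r"!!\s*\d")

def uses_removed_factorial(sql: str) -> bool:
    """Heuristically flag SQL that relies on the factorial operators
    dropped in PostgreSQL 14."""
    return bool(_FACTORIAL_POSTFIX.search(sql) or _FACTORIAL_PREFIX.search(sql))
```

Flagged statements can then be rewritten to call factorial() explicitly, which works on both sides of the upgrade.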
Fortunately, PostgreSQL is awesome enough to provide clear Release Notes stating the changes between versions. But this doesn’t solve our problem: how to upgrade production databases safely?
Timescale to the Rescue
This is one of the many areas in which choosing a cloud database will help. If you are self-hosting your mission-critical PostgreSQL database and want to run a major upgrade, you would first have to create a copy of your database manually, dumping your production data and restoring it into another database with the same configuration as your production database.
Then, you would have to upgrade this database and run your testing there. This process can take a while depending on your database's size (and if we’re talking about a time-series application, it’s probably pretty big).
Timescale makes the upgrading process way more approachable. Timescale is a database cloud for time-series applications built on TimescaleDB and PostgreSQL. In other words, this is PostgreSQL under the hood—with a sprinkle of TimescaleDB as the time-series secret sauce.
Timescale databases (which are called “services”) run on a particular version of TimescaleDB and PostgreSQL:
- As a user of Timescale, you don’t have to worry about the TimescaleDB upgrades: they will be handled automatically by the platform during a maintenance window picked by you. These upgrades are backward compatible and nothing you should worry about. They require no downtime.
- Upgrades between minor versions of PostgreSQL are also handled automatically by the platform during your maintenance window. As we mentioned, these upgrades are backward compatible too. However, they require a service restart, which could cause a small amount of downtime (30 seconds to a few minutes) if you do not have a replica. We always alert users in advance.
But what about upgrades between major versions of PostgreSQL? Since these are often not backward compatible, we cannot automatically upgrade your service in Timescale from, let's say, PostgreSQL 13 to 14: doing so could introduce problems in your code and cause major issues!
Also, upgrading between major versions of PostgreSQL can (unfortunately but unavoidably) introduce some downtime. If you are running a mission-critical application, you want complete control over when that unavoidable downtime will occur. And you certainly want to test that upgrade first.
A database platform like Timescale can certainly help solve this issue. Upgrading your major version of Postgres will always be a decent lift—but a hosted database platform can make this process way smoother, helping you automate what can be automated and also facilitating your testing:
- In Timescale, you can upgrade the PostgreSQL version that’s running on your service by simply clicking a button.
- You can use database forks to test your upgrade safely. With another click of a button, Timescale lets you create a database fork (a.k.a. an exact copy of your database), which you can then upgrade to estimate the downtime required to upgrade your production instance.
- You can also use forks to test your application changes. Once your fork is upgraded, you can run some of your production queries (you can find them using pg_stat_statements) on the fork to ensure they don't contain any breaking changes for the new major version.
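As a sketch of that workflow: pull the busiest statements from pg_stat_statements on the primary, then replay them against the fork. This assumes the pg_stat_statements extension is installed and that you supply DB-API connections (psycopg2, for instance); the column names match pg_stat_statements 1.8+ (PostgreSQL 13 and later), where total_exec_time replaced total_time:

```python
# Most-called statements first; ties broken by cumulative execution time.
TOP_QUERIES_SQL = """
    SELECT query
    FROM pg_stat_statements
    ORDER BY calls DESC, total_exec_time DESC
    LIMIT %s
"""

def top_queries(conn, limit=20):
    """Fetch the normalized text of the busiest statements from the primary."""
    with conn.cursor() as cur:
        cur.execute(TOP_QUERIES_SQL, (limit,))
        return [row[0] for row in cur.fetchall()]

def replay_on_fork(fork_conn, queries):
    """Run each statement on the upgraded fork, collecting any errors.
    Note: pg_stat_statements normalizes parameters to $1, $2, ..., so
    parameterized statements need real values substituted before replay."""
    failures = {}
    for sql in queries:
        try:
            with fork_conn.cursor() as cur:
                cur.execute(sql)
        except Exception as exc:  # surface incompatibilities, keep going
            failures[sql] = str(exc)
            fork_conn.rollback()
    return failures
```

Anything that lands in `failures` is a query to fix in your application before touching the production service.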
Let’s explore this more in the next section. If you’re not using Timescale, you can create a free account here—you’ll have free access for 30 days, no credit card required.
Safely Upgrading Major PostgreSQL Versions in Timescale
Here’s how you can safely upgrade your Timescale service:
- First, fork your service. Timescale allows you to fork (a.k.a. copy) your databases in one click—a fast and cost-effective process. You will only be charged when your fork runs, and you can immediately delete it after your testing is complete.
- Now that you have a perfect copy of your production database ready for testing (with the click of a button), it’s time to click another button to tell the platform to upgrade your major PostgreSQL version automatically. You can do this in Timescale—we’ll tell you exactly how in a minute.
- Once the upgrade is complete in your fork, run your tests.
- To see how long the upgrade took on the fork, go to your Metrics tab and check how long your service was unavailable (the gray zone in your CPU and RAM graphs). This gives you an estimate of how long your primary service will be down when you choose to upgrade it.
- When you’re sure that nothing breaks, you can upgrade your primary service. Plan accordingly! Upgrading will cause downtime, so make sure you have accounted for that as part of your upgrade plan.
Let’s see how this looks in the console.
First, check which TimescaleDB and PostgreSQL version your database is running on your service Overview page.
Forking your service is as easy as going to the Operations tab and clicking the Fork service option. This will automatically create an exact snapshot of your database.
To upgrade your major version of PostgreSQL, go to your Maintenance tab. Under Service upgrades, you will see a Service upgrades button. If you click that button, your service will be updated to the next major version of Postgres (in the example below, the service would be upgraded from PostgreSQL 13.7 to PostgreSQL 14).
Your Upgrade Is Complete
That’s it! You can now use the latest and greatest that PostgreSQL has to offer. That said, choosing to upgrade is no small feat. Before going through the upgrade process, there is a lot to consider, and it is important to have a plan to account for the downtime you will experience.
While the upgrade process can be a bit painful, you can at least rely on Timescale to handle the technical orchestration of the upgrade. In the future, we hope to offer even better tooling to make the upgrade process entirely pain-free (but we have to walk before we can run, right?).
If you’d like to see what Timescale has to offer, start a free trial if you haven’t already. There’s no credit card required!