Connection Pooling on Timescale, or Why PgBouncer Rocks
Today, Timescale announces the general availability of connection pooling on our platform. Connection poolers help scale connections and get better performance out of their database, especially for customers with many short-lived connections.
Unlike many competitors, the connection pooler, hosted PgBouncer, has dedicated infrastructure rather than being colocated with your database. Dedicated infrastructure means the pooler does not use any of the compute resources reserved for your database.
Scaling database connections is a long-standing problem in PostgreSQL. To address this, we are now offering connection pooling, which is effectively hosted PgBouncer, for our customers. This allows you to add a connection pooler to your service. Doing so gives you an additional port in your service connection string where you can connect to your service through the pooler.
As part of the pooler, you get two pools: one session pool and one transaction pool. This allows you to potentially increase performance and throughput to your service via the transaction pool while reserving session connections for trusted clients. Since we are offering PgBouncer, you get all of the amazing benefits PgBouncer has to offer, including all of the monitoring via the
If you have not considered a connection pool, you may want to. Adding a connection pool can improve your performance, especially when using transaction mode, even if you don’t need the additional connections.
This is especially true for applications that are serverless in nature or rely on AWS Lambda, like many IoT applications, or that open and close connections often, like web applications. Opening and closing connections in PostgreSQL, as we’ll explain below, creates additional strain and overhead in your database, which can be alleviated by using a connection pooler.
Besides the benefits of using a connection pooler, which you may already be familiar with, we will also talk about some of the engineering design considerations and decisions that were made as we added pooling as a part of our infrastructure.
Connection Pooling and PostgreSQL Primer
Here, I’ll give a high-level overview of connection poolers and why they can be useful in PostgreSQL. If you are already familiar with connection pooling and are more interested in how we implemented it, you can probably safely skip this section. (I like this article for a more in-depth explanation.)
Connection pooling, at a basic level, is essentially a proxy that sits between your database and your application. This proxy (the pooler) maintains persistent connections with the database as part of a connection “pool.” When the application requests a connection, the pooler lets it use one of the connections from the pool. When the application is done with the connection, the pooler adds the connection to the database back to the pool, freeing it up for when the application needs another connection.
This is fairly straightforward but doesn’t seem to provide much utility at face value and adds a layer between your application and the database. So, why do people want this? To answer that, we need to dive a bit into how PostgreSQL handles connections.
Every time Postgres receives a request for a new connection, the backend process is forked, creating a new backend process for that connection and allocating approximately 5 MB of memory for each connection. Some internal actions often performed by the Postgres server are linear with respect to the number of client backends. The higher the number of backends (connections), the longer this process can take.
Additionally, every time a new connection is opened, it takes time for the client and server to establish connections. That time overhead is wasted time within the database, especially for applications that are opening connections for very short sessions. Basically, the more connections you have, the more work you give Postgres to do for backend operations. (This can be particularly bad if there is a bug in the application that forgets to close connections, resulting in hundreds of concurrent backends!)
Postgres connections: Less is more
In short, connections in Postgres are relatively expensive, and Postgres prefers fewer long-lived connections. Postgres also does not maintain a queue of connections if there are already the maximum number of available connections created and instead throws an error.
These approaches can directly conflict with modern application development, where one often opens the connection as late as possible and closes it as soon as possible and may anticipate a queue if there are no available connections. This is where connection pooling comes in!
With connection pooling, Postgres gets what it prefers for optimal performance—fewer, long-lived connections—while the application can attempt to open many connections, and as often as it needs, without restraint. The connection pooler manages this conflict seamlessly and can maintain a queue of connections in the case where all of the connections in the pool are taken.
This allows you to use Postgres in serverless, web, and IoT applications seamlessly, as these types of applications often prefer short-lived connections, which connection pooling can provide, and increases the chance of query success by being able to wait for a connection rather than have to manage an out of connections error.
Connection pooling also helps scale applications that need to run multiple transactions per second. As the Postgres work model is a connection (session) per client, those applications have to either require Postgres to be configured with way more connections that it can handle efficiently or establish and close connections multiple times per second.
The transaction mode provides a better alternative to both of those by sharing a single connection between multiple clients, as long as each client issues transactions (which may consist of a single statement) relatively independent of each other. With that, one can achieve many thousands of transactions per second with relatively few database connections.
Transaction mode also benefits when applications are improperly configured and do not preemptively close the connection when they are done with it. In session mode, or without a pooler, this takes an entire connection slot. In transaction mode, this isn’t an issue, as the pooler releases the connection once the transaction has been completed.
“My application handles connections perfectly and doesn’t have issues, so I don’t need transaction mode.” Great! Many connection poolers let you have multiple pools. You can create a session pool for trusted clients and a transaction pool for everyone else. Win/win!
In summary, connection pooling in a Postgres context allows you to scale connections more easily and ensure your database will not pay with performance for a growing number of connections.
Evaluation of PgBouncer and Alternatives
As we started to look into connection pooling, we considered several options on the market. We knew we likely did not want to build our own, as there are many incredible open-source options. We would rather leverage an existing, tried and tested option and contribute back where we can, rather than start from scratch. This likely won’t surprise many of you, given Timescale’s deep love for PostgreSQL!
We considered many poolers. This includes Yandex’s Odyssey and pgpool-ii. Ultimately, though, we knew we probably wanted to go with PgBouncer or pgCat. PgBouncer has become the industry standard for connection pooling and is incredibly well respected in the industry. It’s known for its feature completeness and being incredibly reliable. It also uses few resources while being very fast, even while handling lots (10K+) of connections on a single process. For the features we were looking for in a pooler, PgBouncer was perfect.
pgCat is an incredibly interesting alternative. It closely mirrors PgBouncer functionality, while being written in Rust. Going with pgCat could give us much more by offering additional features like load balancing, mirroring, and multi-threading. While these things are incredibly tempting to play around with, they’re a bit beyond the scope of what we are hoping to offer for customers.
One thing pgCat offered that PgBouncer did not until recently was prepared statement support in transaction mode. However, with PgBouncer 1.21, that’s no longer the case!
Given the limited gap in functionality between PgBouncer and other poolers, we ultimately decided to use the tried and true choice—PgBouncer. That said, how we’ve architected poolers could allow us to switch to pgCat (or another pooler) seamlessly in the future. We chose to run PgBouncer in its own pod. This allows us to perform maintenance on PgBouncer without impacting the database at all.
If we colocated PgBouncer and the service, any time we wanted to make an update to just PgBouncer, we would need to restart the whole pod, including the customer database! Since PgBouncer is very lightweight, any changes to the pooler take place almost instantaneously.
How to See Timescale Connection Pooler Statistics
With PgBouncer, all connection pooling customers get access to all of the really useful read-only command line commands by default. This can be a great way to examine the statistics of your pooler. This can be particularly useful after adding it to ensure you’ve configured your connections properly and are connecting to the pools you’d like to.
For example, to use the
- First, connect to your database via the pooler.
- Then, switch to the PgBouncer database, e.g.,
- From there, you can simply run
SHOWwith whatever you’re interested in, e.g.,
Simple as that!
A really common use case of this is to connect and run the
SHOW POOLS command. This tells you the number of database connections used, the number of clients that are active (which is typically way more than the number of database connections, especially in the transaction pool), and the number of clients waiting for a connection. This lets you see firsthand how much PgBouncer is helping your database scale client connections.
How to add a connection pool to your service
You can add a connection pooler to your service in one of two ways.
- During service creation, you can toggle “enable connection pooler” in the service creation screen.
- For an already running service, you can add (or remove) a connection pooler at any time from the “Connection info” tab of the service overview by clicking “Connection pooler” and then “+ Add connection pooler.”
We are excited to offer connection pooling on Timescale! If you want to try connection pooling yourself, you can add a pooler to your service by following the steps above. If you’re not a customer, Timescale offers a 30-day free trial, no credit card required!
And don't forget to check our other launches during Cloud Week, the first of three consecutive launch weeks here at Timescale. So far, we've released:
- Live migrations: a new way to migrate your multi-terabyte PostgreSQL database to Timescale with (almost) zero downtime.
- Timescale Enterprise Tier: a solution designed for your demanding data and business needs.
- Insights: Experience in-depth database monitoring with Insights.
We're just getting started—stay tuned for more!