PostgreSQL Performance Tuning, Part I: How to Size Your Database
You launched your product, you got some users, and now... Things are getting slower. Queries that used to run in 100 ms are now taking one second, and your team (which has spent the last six months writing application code) faces a whole new set of questions around PostgreSQL performance tuning and optimization.
Do I need more CPU or memory? Do I have to alter
max_parallel_workers? Are there other defaults I should start to fine-tune? What about my indexes and schemas?
First, congrats! These are great problems to have: they mean you’re scaling your PostgreSQL database.
Second, don't panic—and definitely don't start frantically Googling for a different database to use. PostgreSQL is the most loved database out there and the best foundation for your application, but if you’re scaling, it may need a little extra love.
This post is the first of a four-part series to help you fine-tune your PostgreSQL database for performance as your data and query volume grow. As often happens with databases, there aren’t foolproof guidelines we can assure will work without direct experimentation and observation: there are very few true invariants and always caveats. Fortunately, though, there are some principles you can systematically follow, and we’ll explore them in this series.
We’ll start by breaking down the relationship between your available database resources (CPU and memory) and performance. It may be tempting to fix every performance issue by throwing more resources (a.k.a. money) at the problem—don’t do this. That is an unsustainable solution that will raise your bill through the roof in the future. So, let’s discuss how to determine the right size for your PostgreSQL database.
Sizing Your PostgreSQL Database
There are three main components of the (usually virtual) machine that runs your database: memory (RAM), CPU, and disk I/O. This post will focus on CPU and memory.
CPU is the core work unit in your database. Operations such as data transfer between the database and client, index scanning, data joining, and the evaluation of WHERE clauses all rely on the CPU. Generally, given the absence of memory or disk constraints, PostgreSQL's read throughput scales in direct proportion to the number of available cores.
CPU is also the core unit of concurrency in your database. Although PostgreSQL can process more than one query per core while a process is waiting for data from disk or from the client, you should not count on your queries per second being any greater than one second divided by your average query runtime and multiplied by the number of cores in your system. Using this back-of-the-napkin math is a great way to pick an initial size for your database.
The idea is that if you know the average time a query takes to run and you know how many CPU cores you have, you can estimate the maximum number of queries your system can handle per second. This assumes each core can handle one query at a time and that other factors, like memory or disk access, are not the primary bottlenecks.
- If you anticipate that your system needs to handle 320 queries per second and know the average query takes 50 ms, you'd need roughly 16 cores (given the linear scaling assumption).
- Conversely, if you have a fixed number of cores available, like eight, and know your average query time, you can estimate the load your system can handle efficiently.
Remember, this is a back-of-the-napkin estimate. Real-world performance will vary due to many other factors, as we’ll cover in this series. In most cases, you can expect a higher QPS than what this formula provides.
If you believe you’ve sized your CPU properly and yet your database is slow, here are some signs that you might need to increase the number of cores on your machine:
- Observed CPU usage remains consistently high. You should always keep an eye on current and historical CPU utilization information to correlate that utilization with application activity. If you’re using Timescale, you can access this information via the Metrics panel in the console. Remember—increasing your CPU is not always the best solution to counter low performance. Your CPU usage may be maxed out because you’re using resources inefficiently, for example, due to misconfigured partitioning or missing indexes.
- Queries slow down when there is higher activity, even without more data. If most of your queries are fast during low-volume periods, but many slow down when you reach certain queries per second, that may be a sign that you need more cores.
- You are using analytical queries heavily (aka aggregates). All queries use CPU, but big joins, aggregates, and sequential scans use more (they may use multiple cores) and for longer. If you have queries that compute aggregates over large amounts of data, make sure that you have enough CPU to still serve your application’s requests. Timescale’s continuous aggregates also solve this problem—they will reduce your CPU requirements considerably while keeping your analytical queries very fast. Consider using them if you're a heavy user of aggregates.
It's always advisable to monitor actual system performance and adjust resources as needed. Here are a few tips on CPU monitoring:
wait_event is null in the
pg_stat_activity for a process, it’s just doing CPU work.
pg_stat_activity is a PostgreSQL view that provides information about the current activities of processes. The
wait_event column specifically tells you what a process is waiting for. If the value in this column is null, it generally indicates that the process isn't waiting on any I/O or lock operations but is actively processing and utilizing CPU. In simpler terms, a null value suggests that the query or operation is CPU-bound, meaning that the CPU is the primary resource being utilized by that particular process.
EXPLAIN_ANALYZE tells you how parallelism is used.
EXPLAIN ANALYZE command in PostgreSQL is used to obtain a detailed execution plan of a query, along with the actual runtime statistics. One key aspect of modern PostgreSQL versions is their ability to use parallelism to execute certain parts of a query. This means that a query can leverage multiple CPU cores to run faster.
When you run
EXPLAIN ANALYZE on a query, the output will indicate which parts of the query are being executed in parallel, how many workers (threads) are being used, and the efficiency of this parallel execution. This can be particularly helpful in optimizing queries and understanding their performance characteristics on multi-core systems.
Pgwatch2 has great tooling for breaking down the CPU usage of the components of your workload.
Pgwatch2 is a popular open-source monitoring tool designed specifically for PostgreSQL databases. One of its standout features is its detailed breakdown of CPU usage for different components and operations within your PostgreSQL instance. By using
Pgwatch2, you can gain insights into which parts of their workload are consuming the most CPU and potentially where optimizations can be made.
Now, it’s time to move to memory.
A PostgreSQL database utilizes memory in two primary ways, each governed by distinct parameters:
- Caching data and indexes. PostgreSQL maintains a cache of both raw data and indexes in memory. The size of this cache is determined by the
shared_buffersparameter. Data is stored in "pages," which are essentially chunks of data. This applies to both data files and indexes. The presence or absence of a page in memory determines if it's cached. While there are ample resources to delve into the intricacies of cache eviction algorithms and page structures, the key thing to remember is that PostgreSQL uses an LRU (Least Recently Used) system for cache eviction.
- Working memory for intermediate data structures. When processing queries, PostgreSQL often needs to create temporary data structures to aid in delivering results. This could be as part of join operations, sorting, or even generating aggregates. The amount of memory that can be allocated for these structures is managed by the
work_memparameter. For instance, consider the various join algorithms employed by PostgreSQL: many require memory for operations like sorting or creating a hash map. Similarly, aggregate functions like
DISTINCT, or queries using
ORDER BY(that don't use an index for sorting), may also create temporary data structures that occupy memory. If a join or hash needs more memory than your
work_memsetting, it will spill to disk, which will be much slower.
These two parameters (`shared_buffers` and
work_mem) are especially important. We’ll come back at them later on in this series, as they’re especially handy in understanding memory-related performance issues in PostgreSQL:
- The cache maintained in
shared_buffersis crucial for speeding up data access. A low cache hit ratio indicates that the database has to frequently read from disk rather than from the faster memory cache. Increasing the
shared_buffersmight help, but ensure it's balanced with the OS cache.
- If complex operations (like sorts or joins) frequently spill to disk because they exceed the configured
work_mem, performance will suffer. Increasing
work_memmay help, but you must be cautious here, too, as setting it too high could eventually consume all your memory, especially with many concurrent connections.
Additionally, it's worth noting that there are two layers of caching: the PostgreSQL memory cache, which is directly managed by the database and is the primary cache level where data and indexes are held, and the operating system cache. This is an auxiliary cache level managed by the operating system.
While PostgreSQL isn't directly aware of what's cached here, this layer significantly accelerates file-reading operations by holding onto files that have recently been read from disk.
As a general rule, if your running database has more memory, it has to read from (and sometimes even write to) disk less, which is a good thing. But extra memory is expensive. How can you know that you have to increase the amount of RAM on your machine?
- Queries have unpredictable performance. If a query is slow when run once and then fast on the second go, it could be a sign that the data it needs cannot stay in the cache consistently. However, there are other potential causes for this type of speed pattern (such as high query planning time, which is cached), so it's worth directly observing this difference by comparing
shared hitin your query plan.
- You have many big indexes. If all of your queries can be satisfied with index scans, and all of your indexes can fit in memory, you will likely get good enough performance even if the database has to go to disk to fetch the whole returned row. Many users are aware that more indexes on a table mean that inserts, updates, and deletes have to do more work, but it's also the case that in order to properly take advantage of additional indexes, you may need to increase the memory on your machine so they can stay in the cache.
- Your sorts are spilling to disk. As we’ve mentioned before, if you see an
External Mergenode while debugging slow queries, this means that the memory required for the operation exceeds the available
work_mem. As a result, PostgreSQL had to write its intermediate data structure to disk and then clean up at the end.
- You see a lot of
DataFileReadwait events in
pg_stat_activity. Anytime a query reads from disk, a wait event is triggered. Directly observing these in
pg_stat_activityor comparing the number of events over time can give you a clue that you may need to fiddle with your memory. Significant wait events are also a sign that sequential scans are happening, so make sure to eliminate the possibility that you're simply missing an index (or have written the query to make the use of the index impossible) before spending more money.
Lastly, here are a few extra tips on memory monitoring.
Measure the size of your indexes
Indexes can consume a significant portion of memory when loaded into the cache, particularly large ones. Knowing the size of your indexes helps you understand how much memory they might occupy and decide which indexes are most crucial. To do so, use the
pg_relation_size() function along with the index name:
Monitor index usage
An unused index not only consumes disk space but can also lead to unnecessary overhead during write operations. However, it's worth noting that only the blocks of an index used by queries will take up space in memory when loaded into cache.
You can regularly monitor which indexes are being accessed and which aren't via the
pg_stat_user_indexes view, which contains statistics about user-defined indexes. Specifically, the
idx_scan column will indicate how many times an index was scanned. A value close to or at zero over extended periods shows the index might be unused.
EXPLAIN ANALYZE can tell you whether something has been fetched from disk
EXPLAIN ANALYZE command in PostgreSQL provides a window into the query planning and execution processes. To get a more detailed view of memory interactions, you should employ
EXPLAIN (ANALYZE, VERBOSE, BUFFERS). This combination of flags furnishes insights into memory usage, buffer hits, and disk reads.
In the output, you'll encounter terms like
shared hit and
shared read. A
shared hit implies that the required data was found in the cache, while a
shared read suggests that the data had to be fetched from the disk. A higher ratio of hits to reads typically indicates better memory optimization.
It's a common misconception to associate sequential scans with disk operations and index scans with memory operations. In reality, a small table's sequential scan might be entirely memory-based, while an index scan could involve disk reads if the index or its related data isn't cached. The type of scan doesn't definitively indicate memory or disk usage; it's the hit-to-read ratio and the nature of the data being accessed that provide clarity.
Again, make use of
Querying this view can show which queries are waiting on memory-related events or are consuming significant resources. This can be especially useful to see if certain queries are consistently causing issues.
To Be Continued
In this article, we’ve started to cover the topic of optimizing and fine-tuning your PostgreSQL performance by addressing CPU and memory sizing, including tips on how to identify the need to resize your CPU or memory.