Top 8 PostgreSQL Extensions You Should Know About
PostgreSQL extensions are one of the best things about PostgreSQL. Extensions add extra functionality to your database by modifying and enhancing the execution of certain processes, helping you bypass limitations—such as maintaining good performance when your tables grow to billions of rows—without the hassle of switching to a whole new database.
PostgreSQL extensions loaded into the database can function just like built-in features, and a rich ecosystem of extensions for all use cases has sprung up over the years. In this article, we’ll teach you how to list available extensions and share eight PostgreSQL extensions you should know about.
How to List Extensions in PostgreSQL
Before getting into our top extensions, let's first see how you can list the available extensions in PostgreSQL and know which ones your database already has.
To list the installed extensions, use the \dx command in the psql command-line interface. When you enter this command, PostgreSQL will display a list of all the extensions currently installed in your database. It’ll look something like this:
                                     List of installed extensions
    Name     | Version |   Schema   |                            Description
-------------+---------+------------+-------------------------------------------------------------------
 pg_trgm     | 1.4     | public     | text similarity measurement and index searching based on trigrams
 postgis     | 3.0.1   | public     | support for geographic objects
 timescaledb | 2.0.0   | public     | enables time-series queries
 hstore      | 1.6     | public     | data type for storing sets of (key, value) pairs
 uuid-ossp   | 1.1     | public     | generate universally unique identifiers (UUIDs)
 plpgsql     | 1.0     | pg_catalog | PL/pgSQL procedural language
- The name column shows the extension name, which you can use to install it.
- The version column tells you the current version of the extension.
- The schema column shows the default schema where the extension's objects will be created.
- The description column briefly summarizes what the extension does.
The SQL query used behind the scenes for the \dx command is actually fairly simple. To get the same results, you can run:
SELECT * FROM pg_extension;
This will give you a list of the extensions that have been installed into the current database, along with details such as the extension's name and the version.
Remember that these are only the installed extensions and don’t include the extensions that are available from your PostgreSQL contrib directory but not installed. To see which extensions are available, you can run:
SELECT * FROM pg_available_extensions;
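If you only want the extensions that are available but not yet installed, you can filter on the installed_version column of the same view. A minimal sketch:

```sql
-- List extensions that are available but not yet installed
SELECT name, default_version, comment
FROM pg_available_extensions
WHERE installed_version IS NULL
ORDER BY name;
```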
Finally, for bonus points, some services like Timescale use an extension whitelist system which is an installable subset of the available extensions list. To see which extensions are available to be installed when your service uses the PostgreSQL Extension Whitelist, you can run:
SELECT pae.*
FROM current_setting('extwlist.extensions') AS cs(e)
CROSS JOIN regexp_split_to_table(e, ',') AS ext(allowed)
JOIN pg_available_extensions AS pae ON (allowed = name)
ORDER BY 1;
Unlike the full available-extensions list, every extension in this list can be installed by the current user.
Top 8 PostgreSQL Extensions You Should Know About
PostGIS fundamentally transforms PostgreSQL into a database system that can efficiently handle spatial data. It introduces additional data types such as geometry, geography, raster, and more, along with a suite of functions, operators, and indexing capabilities tailored to these spatial types.
With PostGIS, PostgreSQL gains the ability to perform complex location queries using SQL, effectively turning it into a powerful spatial database management system with high performance, a rich feature set, and robust capabilities.
One of the primary advantages of PostGIS is its ability to process geospatial data at the database layer, which is typically more efficient than processing at the application layer. This efficiency is particularly beneficial for IoT applications that require sophisticated tracking, routing, and location-based services.
To install PostGIS into your PostgreSQL database, run:
CREATE EXTENSION postgis;
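Once the extension is created, the new spatial types can be used directly in table definitions. Here's a minimal sketch (the deliveries table and its columns are hypothetical) showing a geography column backed by a spatial GiST index:

```sql
-- Store delivery destinations as WGS 84 (SRID 4326) points
CREATE TABLE deliveries (
    id          serial PRIMARY KEY,
    destination geography(Point, 4326)
);

-- A GiST index speeds up spatial predicates such as ST_DWithin
CREATE INDEX deliveries_destination_idx
    ON deliveries USING GIST (destination);
```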
PostGIS sample query
To illustrate how to use PostGIS, let’s share a sample query involving geospatial and time-series elements. This is a great example of how you might combine different Postgres extensions to meet your requirements (in this case, combining PostGIS and TimescaleDB—we’ll talk about TimescaleDB later).
We want to know: "How many taxis picked up passengers within 400 meters of Times Square on New Year's Day 2016?"
-- How many taxis picked up rides within 400 m of Times Square
-- on New Year's Day, grouped by 30-minute buckets?
-- Note: Times Square is at (lat, long) (40.7589, -73.9851)
SELECT time_bucket('30 minutes', pickup_datetime) AS thirty_min,
       COUNT(*) AS near_times_sq
FROM rides
WHERE ST_Distance(pickup_geom,
        ST_Transform(ST_SetSRID(ST_MakePoint(-73.9851, 40.7589), 4326), 2163)) < 400
  AND pickup_datetime < '2016-01-01 14:00'
GROUP BY thirty_min
ORDER BY thirty_min
LIMIT 5;
The above query produces the following results:
-[ RECORD 1 ]-+--------------------
thirty_min    | 2016-01-01 00:00:00
near_times_sq | 74
-[ RECORD 2 ]-+--------------------
thirty_min    | 2016-01-01 00:30:00
near_times_sq | 102
-[ RECORD 3 ]-+--------------------
thirty_min    | 2016-01-01 01:00:00
near_times_sq | 120
-[ RECORD 4 ]-+--------------------
thirty_min    | 2016-01-01 01:30:00
near_times_sq | 98
-[ RECORD 5 ]-+--------------------
thirty_min    | 2016-01-01 02:00:00
near_times_sq | 112
pg_stat_statements tracks statistics on the queries executed by a Postgres database. It will help you debug queries, identify slow queries, and generally give you deeper information about how your queries are running. The statistics gathered by the module are made available via a system view named pg_stat_statements.
To enable the extension, run:
CREATE EXTENSION pg_stat_statements;
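Note that pg_stat_statements must also be loaded at server start. Before running CREATE EXTENSION, add it to shared_preload_libraries in postgresql.conf and restart PostgreSQL:

```
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
```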
pg_stat_statements sample query
To demonstrate how to utilize pg_stat_statements, let's look at a sample query that could help in analyzing database performance by identifying top I/O intensive SELECT queries.
We're interested in: "What are the top 5 I/O intensive SELECT queries?"
-- Note: on PostgreSQL 13+, total_time is split into
-- total_exec_time and total_plan_time
SELECT query, calls, total_time, rows,
       shared_blks_hit, shared_blks_read
FROM pg_stat_statements
WHERE query LIKE 'SELECT%'
ORDER BY shared_blks_read DESC, calls DESC
LIMIT 5;
The output would look like this:
-[ RECORD 1 ]----+---------------------------------------------------
query            | SELECT * FROM customer_data WHERE created_at > $1
calls            | 500
total_time       | 23000
rows             | 500000
shared_blks_hit  | 100000
shared_blks_read | 75000
-[ RECORD 2 ]----+---------------------------------------------------
query            | SELECT name, address FROM orders WHERE status = $1
calls            | 450
total_time       | 15000
rows             | 450000
shared_blks_hit  | 95000
shared_blks_read | 55000
-[ RECORD 3 ]----+---------------------------------------------------
query            | SELECT COUNT(*) FROM transactions WHERE amount > $1
calls            | 300
total_time       | 12000
rows             | 300000
shared_blks_hit  | 85000
shared_blks_read | 50000
-[ RECORD 4 ]----+---------------------------------------------------
query            | SELECT product_id FROM inventory WHERE quantity < $1
calls            | 400
total_time       | 16000
rows             | 400000
shared_blks_hit  | 80000
shared_blks_read | 45000
-[ RECORD 5 ]----+---------------------------------------------------
query            | SELECT * FROM user_logs WHERE user_id = $1 AND activity_date > $2
calls            | 350
total_time       | 17500
rows             | 350000
shared_blks_hit  | 75000
shared_blks_read | 40000
- The query column shows the text of a representative statement.
- The calls column counts how many times the statement was executed.
- The total_time column is the total time spent in the statement, in milliseconds.
- The rows column shows the total number of rows retrieved or affected by the statement.
- The shared_blks_hit column indicates the number of shared block cache hits by the statement.
- The shared_blks_read column indicates the number of shared blocks read by the statement, which directly relates to the I/O load.
The shared_blks_hit and shared_blks_read fields are particularly important for I/O analysis. shared_blks_hit is the count of times disk blocks were found already cached in memory (so no I/O was needed), whereas shared_blks_read is the count of times disk blocks had to be read into memory, which indicates actual I/O operations. High values in shared_blks_read suggest that these queries are the most I/O-intensive, which is a good starting point for performance optimization.
pgcrypto is an extension that enhances PostgreSQL by providing cryptographic functions right within the database system. This extension offers a variety of functions for hash creation, data encryption, and decryption, allowing you to perform secure cryptographic operations on your data within PostgreSQL.
With pgcrypto, PostgreSQL can create secure encrypted data storage and manage sensitive information like passwords, personal data, or financial details using standard algorithms like DES, 3DES, and AES. It's particularly useful when you want to enforce data encryption at the database level, adding a layer of security to guard against unauthorized data access.
One of the benefits of pgcrypto is that it allows the encryption and hashing of data using SQL queries, which can be more efficient and secure than handling this in the application layer. This is especially important for systems that require compliance with data security standards and regulations.
To enable pgcrypto in your PostgreSQL database, run:
CREATE EXTENSION pgcrypto;
Once the extension is created, you can use its functions in your SQL queries.
pgcrypto sample query
Here's how you might use pgcrypto to encrypt and decrypt data. Let's say you want to store encrypted user passwords. First, you would encrypt a password when inserting it into a table:
INSERT INTO users (username, password) VALUES ('john_doe', crypt('my_secure_password', gen_salt('bf')));
In this statement, crypt is a function provided by pgcrypto that hashes the password using the Blowfish-based algorithm, indicated by the 'bf' argument to gen_salt.
Next, to authenticate a user, you would compare a stored password against one provided during login:
SELECT username FROM users WHERE username = 'john_doe' AND password = crypt('input_password', password);
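Beyond password hashing, pgcrypto can also encrypt data you need to read back later, using PGP symmetric encryption. Here's a sketch, assuming a hypothetical payments table with a bytea card_number column (the table, column names, and key are illustrative):

```sql
-- Encrypt a value on insert (pgp_sym_encrypt returns bytea)
INSERT INTO payments (user_id, card_number)
VALUES (42, pgp_sym_encrypt('4111 1111 1111 1111', 'my_secret_key'));

-- Decrypt it back when needed
SELECT pgp_sym_decrypt(card_number, 'my_secret_key') AS card_number
FROM payments
WHERE user_id = 42;
```

In practice, the encryption key should come from the application or a secrets manager, not be hardcoded in SQL.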
pg_partman is an extension that simplifies the creation and maintenance of partitions of your PostgreSQL tables. Partitioning is a key database technique that involves splitting a large table into smaller, more manageable pieces while still allowing you to access the data as if it were one table. It is a very powerful way to keep your large PostgreSQL tables fast and manageable.
With pg_partman, PostgreSQL can manage partitions based on a variety of criteria such as time, serial IDs, or custom values. It eases the maintenance tasks associated with partitioning, such as creating new partitions in advance and purging old ones. This automation is particularly useful for large, time-series datasets that can grow rapidly over time.
To install pg_partman in your PostgreSQL database, you run:
CREATE EXTENSION pg_partman;
pg_partman sample query
Let's consider a scenario where you have a large table of IoT device data that you want to partition by day. Here's how you might set up a daily partition scheme for a table called device_data using pg_partman:
-- Create a parent table
CREATE TABLE device_data (
    time      timestamptz NOT NULL,
    device_id int NOT NULL,
    data      jsonb NOT NULL
);

-- Set up pg_partman to manage daily partitions of the device_data table
SELECT partman.create_parent('public.device_data', 'time', 'partman', 'daily');
In this setup, create_parent is a function provided by pg_partman that takes the parent table name, the column to partition on (time), the partitioning type (partman), and the partition interval (daily).
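pg_partman can also automate purging old data. As a sketch (building on the device_data setup above), you configure retention by updating pg_partman's part_config table; the policy is applied the next time partman.run_maintenance() runs:

```sql
-- Keep only 30 days of device_data; drop older partitions entirely
UPDATE partman.part_config
SET retention = '30 days',
    retention_keep_table = false
WHERE parent_table = 'public.device_data';
```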
After setting up the partitions, pg_partman will help you manage them. For a fully automated, worry-free partitioning solution, you may also want to check out Timescale’s hypertables.
The postgres_fdw module enables you to use a Foreign Data Wrapper to access tables on remote Postgres servers (hence the name "fdw"). A Foreign Data Wrapper lets you create proxies for data stored in other Postgres databases so that they may be queried as if they were coming from a table in the current database.
Postgres_fdw allows you to combine data between two Postgres instances.
Here’s a sample use case:
- You have a Postgres instance (A), and you use postgres_fdw to access data on the Postgres instance (B), which is a remote instance.
- You then run queries combining data from instances A and B at the database rather than the application level.
Get postgres_fdw by running the below command from your psql command line:
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
postgres_fdw sample query
Here’s how you create a connection to your foreign server:
CREATE SERVER myserver
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host '220.127.116.11', dbname 'postgres', port '5432');
This statement creates a connection to a database named postgres on port 5432, hosted at the given IP address. Now, create a user mapping so that users on your database can access the foreign server:
CREATE USER MAPPING FOR postgres SERVER myserver OPTIONS (user 'postgres', password 'password');
Once that’s done, you can import a schema from your foreign database and access any table.
CREATE SCHEMA schema1;
IMPORT FOREIGN SCHEMA public FROM SERVER myserver INTO schema1;
Once imported, you can now access tables of the foreign database on your “local” database, like the example below, where we access the metrics table:
SELECT * FROM schema1.metrics WHERE time < now() - '2 days'::interval;
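The real payoff of postgres_fdw is combining remote and local data in a single query. A sketch, assuming a hypothetical local devices table alongside the imported metrics table (column names are illustrative):

```sql
-- Join a local table with the foreign metrics table at the database level
SELECT d.name, avg(m.value) AS avg_value
FROM devices AS d
JOIN schema1.metrics AS m ON m.device_id = d.id
GROUP BY d.name;
```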
Remember that when the schema of the tables on the foreign server myserver changes, your FDW may stop working, and you’ll need to refresh it by dropping and recreating the schema or reimporting the tables that have changed.
pgvector adds support for vector operations in PostgreSQL. It enables PostgreSQL to execute similarity searches, nearest-neighbor searches, and other complex operations on vector data. This can be particularly advantageous for applications like recommendation systems, image retrieval, and natural language processing tasks that rely on vector similarity calculations.
To integrate pgvector into your PostgreSQL setup, you would run the following SQL command:
CREATE EXTENSION vector; -- Note the extension name is actually vector, not pgvector
pgvector sample query
Suppose you have a database of image features extracted using a machine learning model, and these features are stored as vectors. You want to find the most similar images to a given feature vector. Here's how you might use pgvector to perform a nearest-neighbor search:
-- Assuming we have a table with image features stored as vectors
-- Table: image_features
-- Columns: id (integer), features (vector)

-- Given a query vector, find the 5 most similar images
SELECT id, features
FROM image_features
ORDER BY features <-> 'query_vector'::vector
LIMIT 5;
In this query, the <-> operator is provided by pgvector and represents the distance operator, which calculates the Euclidean (L2) distance between two vectors; pgvector also provides <#> for negative inner product and <=> for cosine distance. The query_vector is the vector representation of the image features you are searching for. This query orders the results by the distance between the query_vector and the features column, effectively returning the closest matches.
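At scale, exact nearest-neighbor scans get slow, so you would typically add an approximate index. A sketch of creating such a table with an IVFFlat index (the 512-dimension size and lists value are arbitrary examples):

```sql
-- A table of 512-dimensional image feature vectors
CREATE TABLE image_features (
    id       serial PRIMARY KEY,
    features vector(512)
);

-- IVFFlat index for fast approximate nearest-neighbor search on L2 distance
CREATE INDEX ON image_features
    USING ivfflat (features vector_l2_ops) WITH (lists = 100);
```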
hstore is a key-value store within PostgreSQL, functioning as an extension that allows you to store sets of key/value pairs within a single PostgreSQL data type. It's designed to efficiently handle data that is not rigidly structured and can flexibly accommodate a varying number of attributes without altering the schema of your tables.
One of the key benefits of using hstore is the ability to index key/value pairs for faster search and retrieval, making it a good choice for semi-structured data or data with sparse attributes. It supports GIN (Generalized Inverted Index) indexing, which can speed up queries on keys and values within the hstore data.
To use hstore in your PostgreSQL database, you need to enable the extension by running:
CREATE EXTENSION hstore;
hstore sample query
Here's an example of how you might use hstore to store and query product data with varying attributes:
-- Create a table with an hstore column for storing product attributes
CREATE TABLE products (
    id         serial PRIMARY KEY,
    name       text NOT NULL,
    attributes hstore
);

-- Insert a product with attributes into the table
INSERT INTO products (name, attributes)
VALUES ('Smartphone', 'color => "black", storage => "64GB", battery => "3000mAh"');

-- Query to find products with a specific attribute
SELECT name FROM products WHERE attributes @> 'storage => "64GB"';
In this example, the @> operator is used to query the hstore column for products with a storage key and a value of "64GB." This kind of query is useful when searching for items based on a subset of their attributes.
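As noted above, hstore supports GIN indexing, and individual values can be pulled out with the -> operator. A sketch building on the products table from the example:

```sql
-- GIN index to speed up containment (@>) and key-existence (?) lookups
CREATE INDEX products_attributes_idx ON products USING GIN (attributes);

-- Fetch a single attribute value (returns NULL if the key is absent)
SELECT name, attributes -> 'color' AS color
FROM products;
```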
pgpcre is a PostgreSQL extension that integrates Perl Compatible Regular Expressions (PCRE) into PostgreSQL. It provides advanced string-matching functionality and is particularly useful when PostgreSQL’s built-in regular expression capabilities are insufficient for complex pattern-matching requirements.
This extension is especially beneficial for applications that require sophisticated text analysis or processing, such as parsing logs, searching text, or validating string formats. One of the advantages of pgpcre over standard PostgreSQL text functions is the additional regular expression features it supports, which are not available in PostgreSQL's native regular expression functions. This includes but is not limited to advanced look-ahead and look-behind assertions, backtracking control verbs, and complex character class definitions.
To add pgpcre to your PostgreSQL database, you would execute:
CREATE EXTENSION pgpcre;
pgpcre sample query
For example, if you want to search for email addresses in a column of unstructured text, you might use a PCRE pattern for matching emails as follows:
-- Assuming we have a table named messages with a column named content
-- Table: messages
-- Column: content (text)

-- Use pgpcre to match email addresses within the content
SELECT content, pcre_match('^\S+@\S+$', content) AS email
FROM messages
WHERE pcre_match('^\S+@\S+$', content) IS NOT NULL;
In this query, the pcre_match function is part of the pgpcre extension and is used to match and extract email-address-like strings from the text content of each row in the messages table. The regular expression pattern provided as the first argument to pcre_match represents a simple email address format.
Is TimescaleDB a PostgreSQL Extension?
Yes! TimescaleDB is another top ⭐️ extension to know. It adds a query and ingest performance boost to PostgreSQL via automatic partitioning, query planner enhancements, improved materialized views, columnar compression, and much more, making it suitable for data-intensive applications like time-series data. It also includes libraries to simplify analytics and advanced functionality, such as a job scheduler.
If you're running your PostgreSQL database on your own hardware, you can simply add the TimescaleDB extension. If you prefer to try Timescale in AWS, you can create a free account on the Timescale platform. It only takes a couple of seconds, no credit card required.
TimescaleDB example query
Suppose we have a sensor data table that records temperature readings from various devices every minute. We want to analyze the average temperature per hour for a specific device. Here’s how you might construct such a query using TimescaleDB:
-- Assuming we have a hypertable named sensor_readings
-- Table: sensor_readings
-- Columns: time (TIMESTAMP WITH TIME ZONE), device_id (INT),
--          temperature (DOUBLE PRECISION)

-- Calculate the average temperature per hour for device with ID 1
SELECT time_bucket('1 hour', time) AS one_hour_bucket,
       AVG(temperature) AS avg_temperature
FROM sensor_readings
WHERE device_id = 1
GROUP BY one_hour_bucket
ORDER BY one_hour_bucket;
In this query, the time_bucket function provided by TimescaleDB is used to map individual readings into one-hour intervals or 'buckets'. Then, the AVG function calculates the average temperature for each interval. This query would be particularly efficient in TimescaleDB due to its optimized handling of time-series data.
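For completeness, the sensor_readings hypertable itself would be created with TimescaleDB's create_hypertable function, which converts a regular table into an automatically partitioned one:

```sql
-- Create a regular table, then convert it into a hypertable
-- partitioned by the time column
CREATE TABLE sensor_readings (
    time        TIMESTAMPTZ NOT NULL,
    device_id   INT NOT NULL,
    temperature DOUBLE PRECISION
);

SELECT create_hypertable('sensor_readings', 'time');
```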
Adding PostgreSQL Extensions to Your Databases in Timescale
We’re huge fans of PostgreSQL’s rich extension ecosystem here at Timescale, so, unsurprisingly, we tried to make access to it as simple as possible from our cloud platform user interface. The Timescale console lists all extensions available on your database service and how to enable them.
And by simple, we mean really simple. You only need to expand any section for the exact instructions on how to enable it.
Looking for a specific extension? Our extension search searches the extension's name and description to help you find what you’re looking for.
Can’t find the extension you’re looking for? You can make a new extension request very quickly from within the console itself. Hit the “Request an extension” button and fill out the name and description of the extension you’re looking for. Help us expand our very own PostgreSQL extension ecosystem!
PostgreSQL extensions are a great way to bypass PostgreSQL’s limitations and add new functionality to your database. In this post, we listed eight helpful PostgreSQL extensions, along with sample queries and install instructions. To learn more about these and other extensions, head here to see what you can do beyond the Postgres basics.
To make life easier for Timescale users, you can choose and enable several PostgreSQL extensions directly in the Timescale console—read our Docs for a full list of the extensions we support. If you want to learn the many other ways we make developers’ lives easier, check out Timescale, our mature cloud database for a fiercer and faster PostgreSQL. It’s free for 30 days, no credit card required.