Start supercharging your PostgreSQL today.
Written by Semab Tariq
When we think about working with both PostgreSQL and Python to create our apps, our immediate concern is how to interact with PostgreSQL using Python. For over 10+ years, we've been relying on a trustworthy adapter called psycopg2. It has served us well, offering a solid connection between PostgreSQL and Python.
However, the fast-paced technological advancements created a need for a more advanced version of psycopg2. This newer option provides enhanced features, flexibility, and improved features as compared to psycopg2. Some of these improvements include server-side binding, fetching all results from one or more executed queries, and support for binary communication.
So, in this blog, we'll explore the following:
Building a basic CRUD application in Python
A comparison of psycopg2 and psycopg3
Key improvements in psycopg3
How Timescale simplifies database deployment for developers
Developer FAQs
Interesting Fact! There is no psycopg3; it is rebranded to psycopg.
If you are only interested in the psycopg binary and start development straight away, we can do so by:
pip install "psycopg[binary]"
Note: The pip version 20.3
or greater is required, or upgrade your pip installation by using this command:
pip install –upgrade pip
If you prefer to build psycopg from source, you’ll need the following prerequisites:
C compiler
Python development headers (e.g., the python3-dev
package).
PostgreSQL client development headers (e.g., the libpq-dev
package).
The pg_config
program is available in the PATH.
Use the following command in the compile psycopg package:
pip install "psycopg[c]"
Before proceeding with the code, ensure that you have your PostgreSQL installed on the system and that it is configured to accept connections.
Follow this link to install PostgreSQL on your system: https://www.postgresql.org/download/linux/debian/
After your database is ready, create a new file and insert the following code into it.
import psycopg
# Connection details (replace with your own)
DATABASE_URL = "postgresql://postgres@localhost:5432/postgres"
def create_item(name, description):
with psycopg.connect(DATABASE_URL) as conn:
with conn.cursor() as cur:
cur.execute("INSERT INTO items (name, description) VALUES (%s, %s)", (name, description))
conn.commit() # Commit the changes to the database
def read_items():
with psycopg.connect(DATABASE_URL) as conn:
with conn.cursor() as cur:
cur.execute("SELECT * FROM items")
return cur.fetchall() # Fetch all results as a list of tuples
def update_item(id, name, description):
with psycopg.connect(DATABASE_URL) as conn:
with conn.cursor() as cur:
cur.execute("UPDATE items SET name = %s, description = %s WHERE id = %s", (name, description, id))
conn.commit()
def delete_item(id):
with psycopg.connect(DATABASE_URL) as conn:
with conn.cursor() as cur:
cur.execute("DELETE FROM items WHERE id = %s", (id,))
conn.commit()
# Creating/inserting the record
create_item("New Item", "This is a new item.")
items = read_items()
print(items)
# Updating the record
update_item(1, "Updated Item", "This item has been updated.")
# Deleting the record
delete_item(2)
Run the code via python3 <FILENAME>.py
.
Feature | Psycopg2 | Psycopg3 |
Code | Written in C | Written in Ppython |
Code compatibility | Code written for Psycopg2 might require adjustments for Psycopg3 | Psycopg3 is designed to be backward-compatible with most pPsycopg2 features |
Python version support | 2.7 and 3.7+ | 3.8+ |
Performance | Good | Generally faster due to improved memory management and optimizations |
Security | Prone to SQL injection vulnerabilities if not careful | Improved security with parameter binding and prepared statements |
Features | Basic database interaction functionalities | Enhances existing features and introduces new advanced capabilities, including asynchronous operations, pipeline mode, and server-side binding |
Error handling | Basic error handling mechanisms | Improved error handling with dedicated exception classes and context managers |
Documentation | Comprehensive, but might require more effort to navigate psycopg2 documentation: https://www.psycopg.org/docs/
| Updated and well-organized documentation with examples psycopg documentation: https://www.psycopg.org/psycopg3/docs/ |
For a detailed performance comparison, check out our psycopg2 vs. psycopg3 performance benchmark.
In psycopg, this functionality involves exchanging binary data, such as images or files, between Python and PostgreSQL. When passing a value via a standard %s
placeholder, the most suitable format (text or binary) is typically chosen, But the binary is the fastest. Other options that we can specify are:
%b
for binary
%t
for text
By default, data is returned in text format, but you can use binary=True
with the execute function when needed.
# Sending binary data to the database (using %s):
cur.execute("INSERT INTO images (data) VALUES (%b)", (image_data,))
# Retrieving binary data as is (default):
cur.execute("SELECT data FROM images")
binary_data = cur.fetchone()[0] # Returns binary data as bytes
# Retrieving data as text:
text_data = cur.fetchone(binary=False)[0] # Decodes binary data to text
A cursor is a tool used to communicate with a PostgreSQL database by executing SQL queries They are normally created by the connection’s cursor()
method.
Imagine you're conversing with a librarian through a special window. You speak into a microphone, asking questions about the books (SQL queries). The librarian, on the other side, listens carefully and retrieves the information you seek, presenting it through the window. This window acts as your cursor.
Now, let's delve deeper into this analogy. Just as in a conversation through the window, you have sequential access and control with a cursor. Each question you ask leads to a specific answer before you move on to the next, mirroring the controlled, sequential manner of communication. It's akin to having a structured dialogue where each query corresponds to a particular response, allowing you to process the results of your query one row at a time.
Moreover, a cursor is stateful, meaning it remembers where you are in the dataset. In our analogy, this is akin to the librarian remembering which questions you've asked and answered so far. This memory enables each new question to be informed by previous interactions, creating a cohesive and informed conversation.
Introduced in the 3.2 version of psycopg, raw query cursors provide developers with the ability to utilize PostgreSQL's native placeholders, such as $1
, $2
, etc., instead of the standard %s
placeholder typically used in parameterized queries.
from psycopg import connect, RawCursor
with connect(dsn) as conn:
with RawCursor(conn) as cur:
cur.execute("SELECT $1, $2", [1, "Hello"])
assert cur.fetchone() == (1, "Hello")
This feature in psycopg sends the query and parameters to the database separately, just like a chef getting the recipe and ingredients separately. The database itself combines them and executes the query, like the chef following the recipe.
Note: Server-side binding works for normal SELECT
and data manipulation statements INSERT
, UPDATE
, DELETE
, but it doesn’t work with SET
or with NOTIFY
.
PostgreSQL provides an alternative approach to handle SET
with set_config()
and NOTIFY
with pg_notify()
. See the example below:
conn.execute("SELECT set_config('TimeZone', %s, false)", ["UTC"])
conn.execute("SELECT pg_notify(%s, %s)", ["chan", "42"])
Note: In both server-side and client-side scenarios, you're limited to specifying parameter values (i.e., strings in single quotes). If you need to parameterize various parts of a statement, like a table name, you'll need to utilize the psycopg.sql module.
from psycopg import sql
cur.execute(
sql.SQL("INSERT INTO {} VALUES (%s, %s)")
.format(sql.Identifier('my_table')),
[10, 20])
Server-side binding in psycopg provides numerous benefits such as enhanced performance, increased security against SQL injections, and handling binary data
With server-side binding, if you include parameters in your SQL query, you can't send multiple statements separated by semicolons in a single execute()
call. This is because server-side binding needs to carefully match parameters to specific statements, which gets tricky with multiple statements in one go.
Use client-side-cursor to execute this, as mentioned in the following example:
cur = psycopg.ClientCursor(conn)
>>> cur.execute(
... "INSERT INTO foo VALUES (%s); INSERT INTO foo VALUES (%s)",
... (10, 20))
In psycopg2, if you run several commands that produce multiple results, it only shows the final result. However, psycopg provides a more efficient method to handle multiple results. After running a query, the first result is immediately available in the cursor and can be fetched using standard methods. To access subsequent results, you can use the cursor.nextset()
method.
cur_pg3.execute("SELECT 1; SELECT 2")
cur_pg3.fetchone()
(1,)
cur_pg3.nextset()
True
cur_pg3.fetchone()
(2,)
cur_pg3.nextset()
None # no more results
psycopg lets you speak to the database without interrupting your Python program's main thread. You can ask questions (queries) asynchronously, meaning they get sent off without pausing everything else. While you wait for the answers, your program can keep doing other tasks and won't get stuck waiting for the slow database reply.
This can be incredibly useful for improving the responsiveness and performance of your code, especially when dealing with long-running queries or frequent communication with the database.
psycopg offers asynchronous versions of Connection and Cursor objects, providing non-blocking methods for executing queries, fetching results, and managing transactions.
In psycopg2, the following loop demonstrates the use of the fileno()
and poll()
methods along with the Python select()
function for asynchronous operations with psycopg:
def wait(conn):
while True:
state = conn.poll()
if state == psycopg2.extensions.POLL_OK:
break
elif state == psycopg2.extensions.POLL_WRITE:
select.select([], [conn.fileno()], [])
elif state == psycopg2.extensions.POLL_READ:
select.select([conn.fileno()], [], [])
else:
raise psycopg2.OperationalError("poll() returned %s" % state)
In psycopg, we can use the following example to establish an asynchronous connection with a PostgreSQL database using psycopg, execute a query asynchronously, and manage the cursor accordingly.
async with await psycopg.AsyncConnection.connect() as aconn:
async with aconn.cursor() as cur:
await cur.execute(...)
psycopg provides enhanced capabilities over psycopg2 when dealing with COPY TO
and COPY FROM
operations, leveraging the PostgreSQL COPY
protocol, known for its efficiency in data loading and modification using SQL techniques.
The following code copies rows from the specified table to standard output and makes it ready for further processing:
with cur.copy("COPY (SELECT * FROM table_name LIMIT %s) TO STDOUT", (3,)) as copy:
With a copy operation, you can import data into the database from various Python structures like a list of tuples or any iterable sequence.
records = [(10, 20, "hello"), (40, None, "world")]
with cursor.copy("COPY sample (col1, col2, col3) FROM STDIN") as copy:
for record in records:
copy.write_row(record)
If data is already formatted in a way suitable for a copy, it can be loaded into the database using:
copy.write().
with open("data", "r") as f:
with cursor.copy("COPY data FROM STDIN") as copy:
while data := f.read(BLOCK_SIZE):
copy.write(data)
It's like copying and pasting chunks of the file into the database table.
Connection pooling was supported in psycopg2, although it wasn't as robust and advanced as the features offered by its successor, psycopg.
Install the required package to use connection pooling with psycopg:
pip install psycopg_pool
Here is an example of how you can acquire a connection from the pool, execute the queries, and, at the end, while the transaction is committed or rolled back, we return the connection to the pool so all pool resources can be released.
with ConnectionPool(...) as pool:
with pool.connection() as conn:
conn.execute("SELECT something FROM somewhere ...")
with conn.cursor() as cur:
cur.execute("SELECT something else...")
# At the end of the `connection()` context, the transaction is committed
# or rolled back, and the connection returned to the pool
# At the end of the pool context, all the resources used by the pool are released
If your application uses async code, you can use the AsyncConnectionPool to achieve connection pooling:
async with AsyncConnectionPool(...) as pool:
async with pool.connection() as conn:
await conn.execute("SELECT something FROM somewhere ...")
with conn.cursor() as cur:
await cur.execute("SELECT something else...")
psycopg2 doesn't officially include Pipeline mode, but psycopg from version 3.1 onwards introduced it. This mode allows sending multiple queries to the PostgreSQL server without waiting for each one to complete before sending the next, which can enhance performance for applications with numerous small, frequent queries.
In psycopg, Pipeline mode is supported through the connection.pipeline()
method.
Here is an example of how you can execute multiple queries in pipeline mode with conn.pipeline()
:
... conn.execute("INSERT INTO mytable VALUES (%s)", ["hello"])
... with conn.cursor() as cur:
... cur.execute("INSERT INTO othertable VALUES (%s)", ["world"])
... cur.executemany(
... "INSERT INTO elsewhere VALUES (%s)",
... [("one",), ("two",), ("four",)])
Feature | Pipeline mode | Asynchronous |
Sending queries | Batch | Individual |
Receiving results | Batch, potentially out of order | Individuals, in order |
Execution | Synchronous | Non-blocking with await |
Best suited for | Many small, independent queries | Longer-running queries, concurrent requests |
Setting up databases can be tough and time-consuming for application developers. It takes a lot of effort and time to configure everything correctly. But, with Timescale, things get much easier. With just a few clicks, developers can set up their database servers within minutes.
This means less hassle and more time for them to concentrate on their application development.
Want to learn more about how to build your application using PostgreSQL/Timescale? Be sure to check the following articles—we’re taking you from data migration to monitoring your database.
Timescale provides a 30-day free trial period, allowing you to create up to three services without incurring any charges during this initial 30-day usage period. To begin with, sign up for Timescale.
Once you've signed up and verified your email, you are all set to create your first service.
Currently, there are two services available:
Time Series and Analytics: PostgreSQL, but faster. Built for lightning-fast ingest and querying of time-based data. Perfect for sensor data, financial data, events, or any other dataset indexed by timestamp or incrementing key. Optimized for performance and cost-efficiency at scale.
Dynamic PostgreSQL: PostgreSQL with the benefits of serverless but none of the problems. Perfect for storing metadata, business information, customer records, or any other traditionally relational workload.
Note: Timescale offers support for connection pooling through pgbouncer. During the service launch, you can choose whether to turn on or off connection pooling as required.
Choose the service you're interested in. After submitting all the details and completing the service creation, your database server should be ready in a couple of minutes. Once it's done, you will see the psql commands to connect to your database. You can also download a file in plain text containing all the details about your service, including your username and password. Ensure that you save this file in a secure location.
Let's see how the above code will be updated to work with our tsdb database hosted on Timescale.
Modify only the DATABASE_URL
, keeping the rest of the code unchanged.
DATABASE_URL = "postgres://<USERNAME>:<PASSWORD>@<HOST>:<PORT>/tsdb?sslmode=require"
Refer to the <SERVICENAME>-credentials.sql file you downloaded earlier for all the necessary information.
Check out the Timescale documentation to learn more about Timescale cloud services.
What kinds of applications can be developed using Python?
Python can be used to develop a wide range of applications:
Web applications
Data analysis tools
Artificial intelligence and machine learning applications
Automation scripts
Desktop GUI applications
Network servers, and more.
I installed psycopg via pip, but I am still facing a Python import error. Make sure that you are using the correct Python environment. Ensure that you've activated the virtual environment or are working in the correct Python environment where psycopg is installed. Also, ensure that you install the psycopg package and run your application using the same user to avoid potential issues.
Can I use Docker for my development? Yes, you can use Docker for your development. Docker provides a containerization platform that allows you to package your application and its dependencies into a container. This ensures consistency across different environments and makes it easier to manage dependencies. Docker containers can be easily deployed and run on various systems, providing a consistent development and deployment environment for your applications.
How do I hide database credentials? To hide your database credentials in Python, you can use environment variables or a configuration file.
Can I use PostgreSQL for web and mobile app development? Yes, PostgreSQL is well-suited for both web and mobile app development. It provides features like reliability, scalability, support for complex queries, and a wide range of data types. So, it can easily be used to build web and mobile applications.
Can I store JSON data inside the PostgreSQL? Yes, PostgreSQL supports storing and querying JSON data. It has native support for JSON and JSONB (binary JSON) data types, allowing you to store, retrieve, and manipulate JSON documents within the database. Learn more.
What is the difference between Rest API and CRUD? CRUD is a set of operations (Create, Read, Update, Delete) for data manipulation, while REST API is an architectural style for web services design that can include CRUD functionality.