How to Build TimescaleDB on Docker

If you’re interested in running TimescaleDB on your machine, we offer easy-to-use installation packages in our docs, including pre-built Docker containers. But sometimes, as a developer, you may prefer to build the latest version of TimescaleDB directly from GitHub, for example, to test out a new feature or contribute to the open-source code. Regardless of your reasoning, building software from a repository can be a daunting task and often comes with caveats that are not clear until you start diving in.

Within this blog post, my goal is to help make this process as simple as possible. Specifically, we will focus on how to build TimescaleDB on Docker. Running a codebase within an independent container has many advantages. These include but are not limited to savings in storage space, not being bound by the operating system of your local machine, and having the freedom to make any changes to the container's system without having to affect your own. Generally, using containers is just easier and cleaner.

With lots of content to cover, let’s go ahead and jump in!

Before you start

Before building TimescaleDB on Docker, we need to have Docker downloaded and installed. The best way to get started with Docker is by visiting the Docker getting started page and downloading Docker Desktop.

Once you have Docker downloaded and installed, you are ready to go!

Create a new container

The first step in our process is creating a container.

When building a container, you have to choose a Docker image as the base for the container's system. For example, when building TimescaleDB, many engineers on the Timescale team use Ubuntu as the Docker image of choice, which gives the container a Linux environment. To set up TimescaleDB, you need to work on a Unix-like system, which is why we often choose Linux.

To create a new container with Ubuntu as the image, run the following command:

docker run --name timescale-main-branch -it -p 127.0.0.1:5431:5432 ubuntu bash

A lot is happening in this little snippet of code so let’s break it down:

  • docker run is the command that triggers the creation of a new container.
  • The --name timescale-main-branch is a flag that allows us to specify the container’s name, in this case being “timescale-main-branch”.
  • Then, I used the flag -it to allow for direct access to the terminal of my container from my local terminal.
  • The -p 127.0.0.1:5431:5432 flag tells Docker to map port 5431 on my local machine (host 127.0.0.1) to port 5432 in the container (the default PostgreSQL port). By doing this, we can reach PostgreSQL on the container's port 5432 from our local port 5431. This port mapping is useful if you want to interact with your TimescaleDB instance from outside the container (for example, with third-party tools).
  • Next, we name the Docker image to use, ubuntu.
  • And lastly, we use the bash command to specify that we want to open and use bash within the container.

After executing this command, you will be presented with a bash shell running inside the Ubuntu container.
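
If you exit that shell, the container stops but is not deleted. Here is a minimal sketch of how you might get back into it later, assuming the container name and port mapping from the command above. The variable names and the reattach helper are illustrative and not part of Docker:

```shell
# Values taken from the docker run command above
CONTAINER_NAME="timescale-main-branch"
HOST_IP="127.0.0.1"
HOST_PORT=5431
CONTAINER_PORT=5432

# The -p flag expects HOST_IP:HOST_PORT:CONTAINER_PORT
PORT_MAPPING="${HOST_IP}:${HOST_PORT}:${CONTAINER_PORT}"
echo "docker run --name ${CONTAINER_NAME} -it -p ${PORT_MAPPING} ubuntu bash"

# Hypothetical helper: restart the stopped container and attach to its shell
reattach() {
    docker start -ai "$CONTAINER_NAME"
}
```

Calling reattach from your local terminal should drop you back into the container's bash shell. Services inside the container do not restart on their own, so a PostgreSQL server that was running before would need to be started again.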

Set up Ubuntu environment with necessary packages

Once we have the container created, we need to set up the environment to work with and build TimescaleDB successfully. Note that the following commands will need to run in the bash shell created by our docker run command.

First, let's refresh the package lists in the container's system so that we install current package versions:

apt-get update

Next, let’s get the necessary packages to install various versions of PostgreSQL:

apt-get install -y gnupg postgresql-common
yes | /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh

Then, let’s install our PostgreSQL instance. In this case, I use PostgreSQL 14. You can specify another version if you would like.

apt-get -y install postgresql-14 postgresql-server-dev-14

We then download some necessary packages for building TimescaleDB.

apt-get install -y gcc cmake libssl-dev libkrb5-dev git nano

Once you have all these packages installed successfully, you are ready to move on!
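
Before moving on, it can help to confirm that the tools the build relies on are actually on the PATH. The following is a small sketch with a hypothetical helper named check_tools. It assumes pg_config is made available by the PostgreSQL development packages installed above:

```shell
# Hypothetical helper: report whether each named tool is on the PATH.
# Returns non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools the build steps rely on
check_tools gcc cmake git pg_config || echo "install the missing packages before continuing"
```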

Clone TimescaleDB from GitHub and build the extension

Next, we need to pull in the TimescaleDB repository from GitHub and get TimescaleDB built and ready to add to PostgreSQL.

Let’s begin by cloning TimescaleDB from GitHub:

git clone https://github.com/timescale/timescaledb/

Once we do that, we can change our working directory to the timescaledb folder:

cd timescaledb/

If you want to build a specific branch, this is where you would specify it. I build off the main branch for this example, but you can use the following command if you want to switch to a different one:

git checkout [name-of-branch]
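
If you already know which branch you want, the clone and checkout steps can be combined. This is a minimal sketch using a hypothetical helper named clone_branch. The --branch and --single-branch options are standard git clone options, and the branch name you pass must exist in the repository:

```shell
# Hypothetical helper: clone only the given branch of TimescaleDB.
# $1 = branch name, $2 = target directory (defaults to "timescaledb")
clone_branch() {
    git clone --branch "$1" --single-branch \
        https://github.com/timescale/timescaledb/ "${2:-timescaledb}"
}

# Example (not run here): clone the main branch into ./timescaledb
# clone_branch main
```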

Now, we can prepare the build system to create the extension:

./bootstrap

Then, we can build the extension and install it! (Note that if you are not using the root user, you may need to include the sudo command for the install step.)

# To build the extension
cd ./build && make

# To install. May need to include sudo if not the root user
make install -j

Once both commands run successfully, you should have TimescaleDB built and ready to add to PostgreSQL.
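
One way to double-check the install step is to look for the extension's control file in the directory that PostgreSQL reads extensions from. The sketch below assumes the control file is named timescaledb.control and uses a hypothetical helper named has_control_file:

```shell
# Hypothetical helper: succeed if <sharedir>/extension/<name>.control exists
has_control_file() {
    [ -f "$1/extension/$2.control" ]
}

# pg_config reports where PostgreSQL keeps its shared extension files
if command -v pg_config >/dev/null 2>&1; then
    if has_control_file "$(pg_config --sharedir)" timescaledb; then
        echo "timescaledb control file found"
    else
        echo "timescaledb control file not found; check the make install output"
    fi
fi
```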

Before we move on to setting up our PostgreSQL instance, make sure to go back out to the root folder. You can do this by moving two directories up in the hierarchy like so:

cd ..
cd ..

Set up PostgreSQL for the TimescaleDB extension

We will continue our journey from the root container folder and set up our PostgreSQL database instance with TimescaleDB.

First, we want to take on the role of the postgres user.

su postgres

Set up and alter PostgreSQL configuration files using secure methods

Next, to set up a secure database and have the ability to connect to our PostgreSQL instance outside of the docker container, we are going to run the following set of commands which alter the PostgreSQL config files.

The first two commands drop the default PostgreSQL cluster and then recreate it with specific parameters. This allows us to set up the postgres user with a password and thus connect to our database from outside the container.

The first command, pg_dropcluster, drops the default PostgreSQL cluster on our system.

pg_dropcluster 14 main --stop

The second command, pg_createcluster, recreates the PostgreSQL cluster on our system with some adjustments. Let’s break them down.

  • The --auth-host=scram-sha-256 flag sets the authentication method for TCP/IP connections to scram-sha-256, the most secure password-based authentication method provided by PostgreSQL. It requires a password to access the database.
  • The --auth-local=peer flag sets peer authentication for local (Unix socket) connections: an operating system user can connect without a password as the database user of the same name. This only applies to users within the Docker container itself.
  • --encoding=utf8 sets the default character encoding of the cluster's databases to UTF-8.
  • And lastly, the --pwprompt flag triggers a prompt for us to create a password for the database superuser, which in our case is postgres. This step will set us up well if we connect to our database from outside of the container.

pg_createcluster 14 main -- --auth-host=scram-sha-256 --auth-local=peer --encoding=utf8 --pwprompt

Next, to alter our PostgreSQL config files, we need to start up our PostgreSQL instance.

service postgresql start

After the instance is started, we can connect to our PostgreSQL database.

psql -U postgres -d postgres

Once we are in, we need to set two system parameters: listen_addresses and shared_preload_libraries. The first parameter, listen_addresses, tells PostgreSQL which IP addresses to listen on for connections; setting it to '*' makes it listen on all of them, so connections from outside the container can reach it. The second parameter, shared_preload_libraries, preloads the TimescaleDB library when the server starts. Without this step, you couldn’t add the TimescaleDB extension.

-- listen on all IP addresses, so that connections from outside
-- the container are accepted
alter system set listen_addresses to '*';

-- load TimescaleDB as a preloaded library (takes effect after a restart)
alter system set shared_preload_libraries to 'timescaledb';

Once we get the main configurations updated for our PostgreSQL instance, we can exit PostgreSQL to finish up our setup on the host-based authentication configuration file.

exit

The last step for setting up our PostgreSQL configuration is editing the “pg_hba.conf” file, the PostgreSQL host-based authentication file. We could do this programmatically (which we show below in the Dockerfile section); however, it can be useful to see how to edit the configuration files manually. We can run the following command to find where the “pg_hba.conf” file is located.

psql -d postgres -c "SHOW hba_file;"

Likely, the file location will be “/etc/postgresql/14/main/pg_hba.conf”. However, use the results you got from the previous code snippet if they differ from ours.

With this location string, we will open the config file using the nano editor we installed earlier:

nano "/etc/postgresql/14/main/pg_hba.conf"

After running this command, the file should open within the terminal. We want to navigate down to the commented # host row, remove the leading #, and replace the placeholder values with all all all scram-sha-256, as shown below.

[GIF: navigating down to the ‘# host’ row of the “pg_hba.conf” file and replacing the values “DATABASE USER ADDRESS METHOD [OPTIONS]” with “all all all scram-sha-256”]
Replacing host values in the “pg_hba.conf” file

Once we make these changes, we are ready to jump into PostgreSQL and officially add the TimescaleDB extension! Woohoo!

Add the TimescaleDB extension

The final and probably easiest step of the process is to add the TimescaleDB extension to our PostgreSQL instance! Now that we have TimescaleDB built and PostgreSQL ready to receive TimescaleDB, all that is left is to hook the two together.

Since we made some adjustments to the config files, we will have to restart PostgreSQL to apply all changes:

service postgresql restart

Once the server has restarted, we can enter our PostgreSQL database and create the TimescaleDB extension. Notice that I use psql -X to enter my database. The -X option tells psql not to read its startup file (psqlrc), which helps avoid any potential problems.

psql -X

create extension timescaledb;

And there you have it! You now have a database with TimescaleDB installed!
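
To confirm that the extension is really there, you could query the pg_extension catalog. The sketch below wraps that query in a hypothetical helper named timescaledb_version. Any arguments are passed straight to psql, so the same helper could be used inside the container or from the host through the published port:

```shell
# Hypothetical helper: print the installed TimescaleDB version (prints nothing if absent).
# Extra arguments, such as a connection URI, are passed on to psql.
timescaledb_version() {
    psql -X -At -c "select extversion from pg_extension where extname = 'timescaledb';" "$@"
}

# Inside the container, as the postgres user (not run here):
# timescaledb_version

# From the host, through the port published with -p 127.0.0.1:5431:5432 (not run here):
# timescaledb_version "postgresql://postgres@127.0.0.1:5431/postgres"
```

The host variant assumes the PostgreSQL client tools are installed locally, and it prompts for the password set during pg_createcluster.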

Automate Docker setup with a Dockerfile

Setting up your Docker instance manually can be great for specifying parameters; however, you may want to automate this process. To do that, we can place all of this code within a Dockerfile so that spinning up a TimescaleDB instance can be as simple as running a few segments of code.

First, we need to create a Dockerfile that can build a Docker image identical to the system setup that we performed earlier. You can use the following code to accomplish this:

# Start from Ubuntu, as in the manual setup
FROM ubuntu
ARG DEBIAN_FRONTEND=noninteractive
ARG POSTGRES_PASSWORD

# Add the PostgreSQL apt repository
RUN apt-get update && apt-get install -y \
   gnupg \
   postgresql-common
RUN yes | /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh

# Install PostgreSQL 14 and the build dependencies
RUN apt-get update && apt-get install -y \
   postgresql-14 \
   postgresql-server-dev-14 \
   gcc \
   cmake \
   libssl-dev \
   libkrb5-dev \
   git \
   nano

# Clone, build, and install TimescaleDB
RUN git clone https://github.com/timescale/timescaledb/
RUN cd timescaledb/ && ./bootstrap
RUN cd timescaledb/build && make
RUN cd timescaledb/build && make install -j

# Recreate the default cluster with password authentication for host connections
RUN pg_dropcluster 14 main --stop
RUN pg_createcluster 14 main -- --auth-host=scram-sha-256 --auth-local=peer --encoding=utf8

# Each RUN starts in a fresh shell, so the user switch is done with USER rather than su
USER postgres
RUN service postgresql start && \
   psql -U postgres -d postgres -c "alter user postgres with password '${POSTGRES_PASSWORD}';" && \
   psql -U postgres -d postgres -c "alter system set listen_addresses to '*';" && \
   psql -U postgres -d postgres -c "alter system set shared_preload_libraries to 'timescaledb';"
RUN sed -i "s|# host    .*|host all all all scram-sha-256|g" /etc/postgresql/14/main/pg_hba.conf
RUN service postgresql restart && \
   psql -X -c "create extension timescaledb;"

Note that we used the sed utility in this Dockerfile to programmatically update the “pg_hba.conf” file, which we edited manually with nano above. You could run the same sed command in the manually created container instead of editing the file by hand.
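
To see what that substitution does without touching a real configuration file, you can try it on a small sample. The sketch below uses a simplified stand-in for the commented template lines in “pg_hba.conf” (the sample content is an assumption, not a copy of the real file) and writes the result to standard output instead of editing in place:

```shell
# Simplified stand-in for the commented template lines in pg_hba.conf
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
# local   DATABASE  USER  METHOD  [OPTIONS]
# host    DATABASE  USER  ADDRESS  METHOD  [OPTIONS]
# hostssl DATABASE  USER  ADDRESS  METHOD  [OPTIONS]
EOF

# Same substitution as in the Dockerfile, but printed instead of applied with -i
RESULT=$(sed "s|# host    .*|host all all all scram-sha-256|g" "$SAMPLE")
echo "$RESULT"
rm -f "$SAMPLE"
```

Only the line that starts with "# host" followed by four spaces is rewritten. The "# local" and "# hostssl" lines do not match the pattern and stay commented.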

Save this file and name it “Dockerfile”.

To create the Docker image from this file, you only need to run the command below. Make sure to put in the correct path to the folder where you saved the Dockerfile, replace the placeholder name for the image with whatever suits your fancy, and update the password.

docker build -t name-of-image --build-arg POSTGRES_PASSWORD=password ~/path-to-file/

Once we build the image, we can create container instances that are ready to use.

docker run --name timescale-main-branch-dockerfile -it -p 5433:5432 name-of-image bash

# once container is created start up the postgres server
service postgresql start

And voila, you have an identical container to the one we manually set up above!
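
If you script against this container from your host machine, the server may need a moment after service postgresql start before it accepts connections. The following sketch waits for it, assuming the PostgreSQL client tools (which include pg_isready) are installed on the host. The retry helper is hypothetical:

```shell
# Hypothetical helper: run a command up to N times, one second apart, until it succeeds
retry() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# From the host: wait for the server published on port 5433 to accept connections
if command -v pg_isready >/dev/null 2>&1; then
    if retry 5 pg_isready -h 127.0.0.1 -p 5433; then
        echo "server is ready"
    else
        echo "server not reachable on port 5433"
    fi
fi
```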

Contributing code?

We love open-source technologies at Timescale and strongly believe in community development. Thus, we encourage you to consider contributing to our codebase!

If you are using these steps for testing any contributions, make sure also to check out our instructions and guidelines around testing. Here you can find additional information about contributing to TimescaleDB, along with further instructions on how to set up and add tests to our testing suite.

Wrap up

Whatever your reasoning is for building TimescaleDB within Docker, hopefully, this guide will help you get there!

If you want to share the fun things you are working on or have any questions, check out our community Slack and our new and shiny Timescale Forum. ✨

One last reminder: while building a Docker container instance can be fun and helpful in checking out the latest changes in the main branch, sometimes using a pre-built image of a released version can be better. Make sure to check out our docs for information on how to quickly create a Docker container instance from our released versions.

And if you want us to take care of all the database management side of things, you can also sign up for Timescale Cloud. You will get free 30-day access, no credit card required.

Thanks for giving us your time, folks.

Happy coding!
