How to install TimescaleDB on AWS

How to install TimescaleDB on AWS

A walkthrough with hardware sizing recommendations

Original post by William Hoang

TimescaleDB is an open-source scalable SQL database built for time-series data, optimized for fast ingest and complex queries. It speaks “full SQL” and is correspondingly easy to use like a traditional relational database, yet scales in ways previously reserved for NoSQL databases. (More info in our Docs.)

In this post, we will walk you through how to set up TimescaleDB on Amazon Web Services (AWS).

You will learn how to:

  • Get up and running on AWS
  • Set up your EC2 instance
  • Install TimescaleDB
  • Configure PostgreSQL
  • Use TimescaleDB

Let’s get started!

Getting up and running with AWS

Head to https://aws.amazon.com/ and create your AWS account. Enter the required credentials and access your AWS console. You should see a similar page as the one below:

The AWS console!

On the top left, select the ‘Services’ tab and a panel of AWS services will be provided. Next select ‘EC2’:

Finding ‘EC2’ — under the ‘Services’ tab

Click on ‘Launch Instance’ to view the Amazon Machine Image (AMI):

Launching the ‘EC2’ instance to view the AMI

We will select the ‘Ubuntu Server 16.04 LTS’ image or later as it aligns with one of TimescaleDB’s Linux distributions at the time of writing. (See a list of Timescale Linux distributions here.)

Choosing an AMI

Next we will set up our instance type.

Note: In a production environment we’d recommend running a R4.4xlarge. We recommend this machine size because the inserts are a high-disk-throughput operation and so the expected throughput of the machine is important. (For more information, please see the expected throughput for each type, here.)
In addition, in production we’d recommend 2 EBS volumes:
- 1 for the WAL: GP2, 350 GB (Even though the WAL only needs ~10GB, we recommend a larger volume size for the higher IOPS)
- 1 for the Data: GP2, 5 TB

But for the purpose of this tutorial, we will select the ‘t2.micro’ instance and click ‘Review and Launch’ to jump ahead to the final step.

Choosing the instance type

Now we’re on the last and final step, where we review the AMI details before launching the instance. For this tutorial we are using Ubuntu Server 16.04 LTS and have the type set to t2.micro.

Since we are not doing any performance testing, it is best to select the minimal configuration machine settings, with the security set to default.

Upon a final check, click on ‘Launch’:

Reviewing our AMI details before we launch the instance

Next, you will be brought to the key pair page where we will encrypt and decrypt login information for the remote instance.

In this tutorial we will create a new key named ‘timescale. You may name your key accordingly or use an existing key if you have one. You can also choose ‘Proceed without a key pair,’ but we recommend that you secure the instance with a key.

Next, click ‘Download Key Pair’ and save it in a secure location on your local computer.

Then click ‘Launch Instance’ to start running the instance remotely:

Creating a key pair, downloading it locally, and launching the instance

To determine if the instance is running, you will see a similar page like the one below where it shows the instance being initialized.

Click on the instance ID to see the details:

The launch status page indicates if the instance is running or not

The instance details page will be displayed as shown below. Next we need to connect to our instance locally in order to install TimescaleDB remotely.

Click ‘Connect’.

Viewing our instance details

Then you will locate the command for SSH and copy the entire line into your clipboard.

Finding and copying the SSH command

For the example we have:

ssh -i “timescale.pem” ubuntu@ec2–54–69–235–152.us-west-2.compute.amazonaws.com

We will now go to our local file system and locate where we saved our .pem key pair file. (I am on a Mac and will open the Terminal Application and then locate the directory that contains the file.)

We will change the permission on the ‘timescale.pem’ file in order to access our instance next. Within your terminal at the location of the .pem file, execute:

sudo chmod 400 timescale.pem

If prompted, enter your local computer’s password. Next we will connect to our instance. (For more information on Amazon’s key pairs see here.)

Execute the SSH command that you copied in your AWS instance. For example: ssh -i “timescale.pem” ubuntu@ec2–54–69–235–152.us-west-2.compute.amazonaws.com

You should be logged into the remote Ubuntu instance you just setup.

We will now prepare our machine to install TimescaleDB!

Setting up your EC2 instance

For this section we will reference and walk through the TimescaleDB installation guide. Now that we are in our instance, we will add TimescaleDB’s Personal Package Archive via

sudo add-apt-repository ppa:timescale/timescaledb-ppa

and then enter your password, if prompted:

Next we will update our apt-get:

sudo apt-get update

Installing TimescaleDB

With the packages dependencies installed, we can set up TimescaleDB in our instance by executing:

sudo apt install -y timescaledb-postgresql-9.6

(If you have PostgreSQL 10, install timescaledb-postgresql-10 instead)

With TimescaleDB installed, we can configure PostgreSQL to work in our example at the end of this tutorial.

Configuring PostgreSQL

First, let’s check the version of PostgreSQL that we have installed:

psql --version

For TimescaleDB and our current tutorial, you should have version:

psql (PostgreSQL) 9.6.3

Next, go into the PostgreSQL configuration files by accessing the folder:

cd /etc/postgresql/9.6/main
ubuntu@ip-172–31–17–231:/etc/postgresql/9.6/main$

Change the permission settings for the postgresql.conf file:

sudo chmod 644 postgresql.conf

Now we have the ability to write to file. Using a text editor (e.g. ‘nano’) let‘s make some modifications:

nano postgresql.conf

Under the ‘CONNECTIONS AND AUTHENTICATION’ heading and within the ‘Connection Settings’ area (see below), uncomment the ‘listen_addresses’ attribute and add in the value ‘*’ to set all IP addresses to listen on:

The default value is ‘localhost’ so what you should now have is:

listen_addresses = ‘*’ # what IP address(es) to listen on;

Within the ‘RESOURCE USAGE’ section of the postgresql.conf file,

Find the ‘Kernel Resource Usage’ area (see below). Uncomment the ‘shared_preload_libraries’ attribute and add in ‘timescaledb’ as its value.

What you should have is:

shared_preload_libraries = ‘timescaledb’

Save the postgresql.conf file and exit out of your text editor.

Now we need to restart the instance of PostgreSQL by running:

sudo service postgresql restart

Then execute the following command:

sudo su - postgres

We can now use TimescaleDB.

Using TimescaleDB

Now that you have TimescaleDB installed and configured, let‘s walk through an example of running TimescaleDB on our remote AWS instance.

Note: The Postgres server needs to be running for the next step to work.

Within your instance, type in:

psql -U postgres

You should then be in the PostgreSQL instance:

postgres=#

Now we will walk through the steps of working with TimescaleDB as outlined in the ‘Getting Started: Setup’ page in the docs.

  1. Create a database named ‘tutorial’
CREATE DATABASE tutorial; 

2. Connect to the database

\c tutorial

Now you may connect to the database each time by executing:

psql -U postgres -h localhost -d tutorial

3. Initialize extension

CREATE EXTENSION timescaledb;

4. Create a hypertable

First, create a vanilla PostgreSQL table:

CREATE TABLE conditions (   
  time        TIMESTAMPTZ       NOT NULL,   
  location    TEXT              NOT NULL,   
  temperature DOUBLE PRECISION  NULL,   
  humidity    DOUBLE PRECISION  NULL 
);

Then, referencing the Timescale APIs, convert the table via the create_hypertable() function. This transforms our table into a hypertable.

SELECT create_hypertable('conditions', 'time');

5. Insert data into the hypertable

INSERT INTO conditions(time, location, temperature, humidity) 
  VALUES (NOW(), 'office', 70.0, 50.0);

6. Query data from the hypertable

SELECT * FROM conditions ORDER BY time DESC LIMIT 100;

Voila! You have set up TimescaleDB on AWS.

From our Sample Datasets you may load more data into TimescaleDB and play with some of our Basic Operations and API calls.

For some lengthier tutorials please check out Tutorials, which includes a walkthrough on using PostGIS with TimescaleDB, as well as this blog post: Analyzing Ethereum, Bitcoin, and 1200+ other Cryptocurrencies.


If you have any follow up questions, please email us or join our Slack group.

And if you’d like to learn more about TimescaleDB and support us, please check out our GitHub, star us, and let us know how we can help out.

Ingest and query in milliseconds, even at terabyte scale.
This post was written by
7 min read
Product & Engineering
Contributors

Related posts