Do More on AWS With Timescale: 8 Services to Build Time-Series Apps Faster

You may know Timescale as the developer of TimescaleDB, the leading time-series and analytics database built on PostgreSQL. We’re also building Timescale, a cloud-native PostgreSQL solution for time series, events, and analytics that expands traditional cloud databases' boundaries by combining all the goodness of PostgreSQL with the flexibility of cloud-native architectures. And we’re building it on AWS.

With its extensive catalog of services, AWS is the industry leader in public cloud infrastructure—and the default choice for many developers on where to build their projects. One of the advantages of building in AWS is that you can mix and match its wide range of services and tools to architect your data infrastructure, avoiding the need to create these services from scratch and speeding up development time.

Timescale is built to serve developers working with time series, events, and analytics applications in AWS. You can integrate your Timescale databases seamlessly into your existing AWS infrastructure as your all-in-one datastore for relational and time-series data for your applications.

But don’t just take it from us. See what one of our customers, Square Roots, says about building on AWS with Timescale:

"Timescale integrated seamlessly into our AWS data pipeline with AWS IoT Greengrass, AWS Kinesis, and AWS Lambda to help power our controlled environment agriculture platform."

Mark Thompson, Senior Infrastructure Engineer
Square Roots

A downside to AWS’s extensive range of tools is the paradox of choice. To help solve that problem, we’ll tell you about eight AWS services that Timescale customers love using with Timescale, ranging from tools to ingest data into your database and business intelligence tools to services for low-cost data archiving and tiering.

In particular, we’ll cover the pairing of Timescale with the following services:

  • Amazon VPC
  • AWS Lambda
  • AWS IoT Tools: IoT Core and IoT Greengrass
  • Amazon QuickSight
  • Amazon CloudWatch
  • Amazon Managed Streaming for Apache Kafka (Amazon MSK)
  • Amazon S3

Let’s get into it!

Amazon VPC

Virtual Private Cloud (VPC) peering is a method of connecting separate private cloud networks. It lets virtual machines in different VPCs talk to each other without going through the public internet, much like the traditional network a business would operate in its own data center, but with the benefits of scalable cloud infrastructure.

Amazon VPC is the service that bridges your Timescale databases and the rest of your AWS infrastructure. VPC peering enables you to securely access data stored in Timescale from your existing cloud infrastructure without ever exposing your services to the public internet.

More specifically, this service creates a private network “peering” connection between your Amazon VPC(s) and your Timescale VPC(s), making it possible for both to speak to each other without going through the wider internet.

The AWS VPC and the Timescale Cloud VPC linked by a private connection
VPC peering using Amazon VPC enables you to establish a private connection between Timescale and other elements of your AWS infrastructure, giving you maximum security and privacy.

This is very useful for running a managed database with the utmost privacy. For example, you may be hesitant to use a managed service because you’re concerned about exposing your database to the public internet. VPC peering solves this issue, giving you a private connection between your database and the rest of your AWS infrastructure. With VPC peering, you can enjoy all the benefits of a managed database service in Timescale without compromising on the isolation you’d get in a self-hosted deployment in AWS.

VPC peering is useful for simple peer-to-peer connections, but it can also be used for more advanced deployments. For example, you can create multiple Virtual Private Clouds per service, meaning that you could set up separate VPCs for different applications—or your dev, staging, and production environments—each with its own set of security and access control preferences.

Finally, it’s worth noting that using VPC peering in Timescale is very inexpensive: it costs only $0.030/hr per connection, which comes out to around $22/month (0.030 × 730 hours).

Want to learn more about Amazon VPC and Timescale? The following resources will tell you everything you need to know:

  • [Blog Post] VPC Peering: From Zero to Hero: A comprehensive guide on how VPC peering works and how to set it up in Timescale. This guide also includes information on how to peer Timescale with your own EC2 instance, AWS Lambda, and Amazon QuickSight (more info on these later in this post).
  • [Docs] VPC Documentation: Contains step-by-step instructions for setting up VPC peering in Timescale using Amazon VPC.

AWS Lambda

AWS Lambda is a popular serverless compute service that lets you run applications without worrying about provisioning or managing the underlying infrastructure. As a user working with AWS Lambda, you define event-based functions that will run your code in response to triggers.


As mentioned in the Amazon VPC section, AWS Lambda is one of the services you can connect to Timescale via VPC peering to access and insert data into your databases.

Because it is serverless, AWS Lambda lets you operate your data pipelines with almost no operational overhead while paying only for what you consume. You can also connect AWS Lambda with Amazon API Gateway to expose your function as an API endpoint, or run the function on a schedule or in response to events using Amazon EventBridge or Amazon SNS/SQS. It supports several runtimes, including Go, Node.js, Java, and Python.

Timescale customers often use AWS Lambda to route time-series data into Timescale—for example, using AWS Lambda together with edge runtimes for IoT like AWS IoT Greengrass.

Furthermore, AWS Lambda can be used for transforming, fetching, and performing other data operations on tables and hypertables in your Timescale databases.

The AWS Lambda dashboard
You can connect AWS Lambda to Timescale via VPC peering. The example above shows how you can directly query hypertables in Timescale using AWS Lambda and psycopg2, the popular PostgreSQL database adapter for Python
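
To make the idea concrete, here is a minimal sketch of a Lambda function that queries a hypertable with psycopg2. It is an illustration under assumptions, not a reference implementation: the `conditions` hypertable and its columns, the `TIMESCALE_DSN` environment variable, and the `hours` query parameter are all hypothetical names, and psycopg2 has to be packaged with the function (for example, in a layer).

```python
import json
import os

# Hypothetical hypertable and columns, used only for illustration.
QUERY = """
    SELECT time_bucket('1 hour', time) AS bucket,
           device_id,
           avg(temperature) AS avg_temp
    FROM conditions
    WHERE time > now() - make_interval(hours => %s)
    GROUP BY bucket, device_id
    ORDER BY bucket DESC
    LIMIT %s;
"""


def fetch_hourly_averages(conn, hours=24, limit=100):
    """Run the aggregate query on an open DB-API connection; return a list of dicts."""
    cur = conn.cursor()
    try:
        cur.execute(QUERY, (hours, limit))
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


def lambda_handler(event, context):
    # Imported here so the module still loads where psycopg2 is not installed.
    import psycopg2

    # "queryStringParameters" is present when the function sits behind API Gateway.
    params = event.get("queryStringParameters") or {}
    hours = int(params.get("hours", 24))

    # TIMESCALE_DSN is an assumed environment variable holding the connection string.
    conn = psycopg2.connect(os.environ["TIMESCALE_DSN"])
    try:
        rows = fetch_hourly_averages(conn, hours=hours)
    finally:
        conn.close()

    # default=str serializes timestamps and Decimal values.
    return {"statusCode": 200, "body": json.dumps(rows, default=str)}
```

Keeping the query in its own function, separate from the handler, makes it possible to test the logic locally with a stub connection before deploying.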

To learn more about AWS Lambda and how to use it in your next project, see the resources below:

  • [Tutorials] AWS Lambda Tutorial: Here’s a tutorial that walks you through how to create a data API for Timescale using AWS Lambda and AWS API Gateway, how to pull data from third-party APIs and ingest it into Timescale, and how to continuously deploy your Lambda function using GitHub Actions.
  • [Blog Post] AWS Lambda For Beginners: Overcoming the Most Common Challenges: Useful advice on navigating the trickiest parts of working with AWS Lambda, like adding external dependencies, overcoming the 250 MB package limit with container images, or setting up continuous deployment.
  • [Blog Post] How to Peer Timescale With AWS Lambda: Navigate to the “Peering Timescale…” section, where you’ll find detailed instructions on establishing a successful connection between AWS Lambda and Timescale.

IoT Tools: AWS IoT Core and AWS IoT Greengrass

IoT is one of the most popular use cases among Timescale customers. The two AWS IoT services we hear about most often are AWS IoT Core and AWS IoT Greengrass; here’s how Timescale can be used with them to build stellar IoT applications.

AWS IoT Core establishes a secure, bidirectional connection between your edge devices and your AWS infrastructure in a serverless manner. It supports the most common networking protocols (LoRaWAN, MQTT, and HTTPS), helping you manage your IoT fleet, which can get significantly complex once you start having thousands (or even millions) of devices.

A diagram of some of the most common networking protocols, AWS IoT Core, AWS Lambda, and Timescale Cloud
Timescale customers often send sensor data from their devices to tools like AWS IoT Core (to help them manage the connection between edge and cloud) and use services like AWS Lambda to store that sensor data in Timescale.
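
The last hop of that pipeline can be sketched as a Lambda function that receives a device message forwarded by AWS IoT Core and inserts it into Timescale. This is only a sketch under assumptions: the message fields (`ts` in epoch milliseconds, `device_id`, `temperature`, `humidity`), the optional `readings` batch key, the `sensor_readings` hypertable, and the `TIMESCALE_DSN` environment variable are hypothetical and should be adapted to your own payloads and schema.

```python
import os
from datetime import datetime, timezone

# Hypothetical hypertable and columns, used only for illustration.
INSERT_SQL = (
    "INSERT INTO sensor_readings (time, device_id, temperature, humidity) "
    "VALUES (%s, %s, %s, %s)"
)


def to_row(message):
    """Convert one decoded sensor message into a tuple matching INSERT_SQL."""
    ts = datetime.fromtimestamp(message["ts"] / 1000, tz=timezone.utc)
    humidity = float(message["humidity"]) if "humidity" in message else None
    return (ts, message["device_id"], float(message["temperature"]), humidity)


def store_readings(conn, messages):
    """Insert a batch of messages in one transaction; return the number of rows."""
    rows = [to_row(m) for m in messages]
    cur = conn.cursor()
    try:
        cur.executemany(INSERT_SQL, rows)
        conn.commit()
    finally:
        cur.close()
    return len(rows)


def lambda_handler(event, context):
    # Imported here so the module still loads where psycopg2 is not installed.
    import psycopg2

    # Accept either a single message or a batch under the assumed "readings" key.
    messages = event.get("readings", [event])
    conn = psycopg2.connect(os.environ["TIMESCALE_DSN"])
    try:
        return {"inserted": store_readings(conn, messages)}
    finally:
        conn.close()
```

Batching several readings into one `executemany` call and one commit keeps the number of round trips low, which matters once many devices report at once.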

AWS IoT Greengrass is an edge runtime that helps you configure your IoT devices faster via pre-built modules and functionality. This service can be useful if you have a large fleet of devices performing some form of edge processing (like Lambdas or machine learning inference), if devices communicate with each other, or if you're operating with disrupted internet connectivity. AWS IoT Greengrass integrates with AWS IoT Core, but it also allows you to directly stream data to services like Amazon Kinesis or Amazon S3.

You can use these tools to build your IoT data architecture, storing your sensor data in Timescale. Our customers love using Timescale for IoT because of its performance at scale (think: real-time queries and dashboards over millions of data points), time-series functionality, seamless integration with data visualization tools and end-user systems, and great cost efficiency for high data volumes.

If you’re running an IoT use case, make sure you give Timescale a try (it’s completely free for 30 days).

Amazon QuickSight

Amazon QuickSight is a managed business intelligence (BI) tool that provides easy-to-use visualizations and dashboards for getting insights from your business data. It integrates with a wide range of data sources, including Timescale, via its PostgreSQL driver. It also provides machine learning functionality for pattern and anomaly detection.

It’s another popular tool that Timescale customers love to use with Timescale via VPC peering.

Amazon QuickSight dashboard
Amazon QuickSight is a powerful BI tool that is very popular among Timescale customers (Source: aws.amazon.com)

Amazon CloudWatch

Timescale integrates directly with Amazon CloudWatch, so you can monitor your Timescale database services there. Amazon CloudWatch provides a reliable, scalable, and flexible monitoring solution that’s easy to spin up in minutes, saving developers the burden of managing their own monitoring systems and infrastructure.

Most of our customers use Timescale in production, and mission-critical applications require close monitoring of service metrics to ensure that the database operates efficiently and without interruption. This is where integrating Timescale with monitoring tools like Amazon CloudWatch can be extremely helpful: you can set up alerts on your service metrics to get notified whenever memory usage surpasses a certain threshold or storage starts to fill up.
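
As an illustration, a memory alert like the one described above could be defined with boto3's `put_metric_alarm`. Treat this as a sketch: the parameter names are the ones `put_metric_alarm` accepts, but the namespace, metric name, and dimension below are hypothetical placeholders, so replace them with the names your Timescale metrics actually appear under in CloudWatch.

```python
def build_memory_alarm(service_id, sns_topic_arn, threshold_percent=80.0):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"timescale-{service_id}-memory-high",
        "Namespace": "Timescale",             # hypothetical namespace
        "MetricName": "memory_used_percent",  # hypothetical metric name
        "Dimensions": [{"Name": "service_id", "Value": service_id}],  # hypothetical
        "Statistic": "Average",
        "Period": 300,            # evaluate five-minute averages
        "EvaluationPeriods": 3,   # alarm after three consecutive breaches
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # SNS topic that delivers the notification
    }


def create_alarm(alarm):
    """Create or update the alarm; needs boto3 and AWS credentials at call time."""
    import boto3

    boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

Requiring three consecutive five-minute periods above the threshold is one way to avoid being paged for short spikes; tune both values to your workload.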

Amazon Managed Streaming for Apache Kafka (MSK)

Apache Kafka is a popular real-time event streaming service used for a wide variety of data-intensive applications. You can write your own producers to insert generated data into Kafka topics and subsequently write consumers to subscribe to those topics to receive all newly generated data.

The Kafka Connect framework enables you to easily stream data in and out of Kafka to and from other services and software using pre-written connectors. A popular one is the JDBC sink connector, which allows you to ingest data into PostgreSQL from a Kafka topic. Because each Timescale database is also a PostgreSQL database, you can use this JDBC sink to ingest data into Timescale.

Deploying and maintaining a Kafka cluster can be a monumental task requiring intimate knowledge of Kafka, ZooKeeper, and various other tools like Kafka Connect. A good alternative is Amazon Managed Streaming for Apache Kafka (MSK for short). One of its features, MSK Connect, allows you to deploy Kafka connectors at scale.
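
To give a feel for what a JDBC sink pointed at Timescale involves, the sketch below builds a connector configuration in Python. It assumes the Confluent JDBC sink connector and its standard property names; the connector name, topics, host, and credentials are hypothetical placeholders, and your connector version may need additional settings.

```python
import json


def jdbc_sink_config(name, topics, host, port, dbname, user, password):
    """Build a Kafka Connect configuration for a JDBC sink writing to PostgreSQL."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "tasks.max": "1",
            "topics": ",".join(topics),
            "connection.url": f"jdbc:postgresql://{host}:{port}/{dbname}?sslmode=require",
            "connection.user": user,
            "connection.password": password,
            "insert.mode": "insert",
            # Assumes the target table (ideally a hypertable) already exists.
            "auto.create": "false",
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    config = jdbc_sink_config(
        "timescale-sink", ["metrics"], "host.example.com", 5432,
        "tsdb", "tsdbadmin", "change-me",
    )
    print(json.dumps(config, indent=2))
```

The `name` plus `config` shape is what the Kafka Connect REST API expects; when creating a connector in MSK Connect, the key-value pairs under `config` are the part you would supply.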

Amazon S3

Amazon S3, also known as Amazon Simple Storage Service, is a highly scalable cloud object storage service that stores data as objects within buckets. It’s built to store and retrieve large volumes of data.

Unlike some of the services and tools mentioned above, no integration or dev work is required to use Amazon S3 with Timescale databases—you can tier data from a Timescale database to Amazon S3 right within a Timescale database itself!

Tiered Storage is a multi-tiered storage architecture engineered to enable infinite, low-cost scalability for your time-series and analytical databases in the Timescale platform. By running one command on your Timescale database, you can transparently tier older, infrequently accessed data to this low-cost storage tier for a flat price of $0.021 per GB/month.

Amazon S3 is an important service for developers building cloud-native applications. It’s an object storage service with excellent durability, high availability, and virtually infinite scalability that allows you to store vast volumes of data at a lower cost than other AWS storage services, like EBS, via its consumption-based pricing. S3 is one of the most popular services in AWS (perhaps the most popular), and it’s widely used for data warehousing and archiving.

But building, integrating, and operating a separate data warehouse or data lake for your time-series data means more development work, complexity, and costs. With Timescale, moving data from the database to an object store is as simple as running a SQL command. You’ll pay only for what you store—no extra charge per query or data read.

To make the most of our Tiered Storage backend architecture, this is all you need to do:

  • Log in to Timescale.
  • In the Timescale console, in the Overview tab, locate the Tiered Storage card. Click Enable tiered storage and confirm the action.
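
With tiered storage enabled, the single command mentioned earlier can be issued from any PostgreSQL client. The sketch below runs it from Python. It assumes the `add_tiering_policy` function described in the tiered storage documentation (check the current docs for its exact signature), a hypothetical hypertable named `conditions`, and a three-week threshold chosen purely for illustration.

```python
# Assumed SQL: add_tiering_policy(hypertable, move_after) as described in the
# tiered storage docs; verify the signature against the current documentation.
TIERING_SQL = "SELECT add_tiering_policy(%s, %s::interval);"


def enable_tiering_policy(conn, hypertable, move_after="3 weeks"):
    """Ask Timescale to tier chunks of `hypertable` older than `move_after`."""
    cur = conn.cursor()
    try:
        cur.execute(TIERING_SQL, (hypertable, move_after))
        conn.commit()
    finally:
        cur.close()


if __name__ == "__main__":
    import os
    import psycopg2

    # TIMESCALE_DSN is an assumed environment variable holding the connection string.
    connection = psycopg2.connect(os.environ["TIMESCALE_DSN"])
    try:
        enable_tiering_policy(connection, "conditions")  # hypothetical hypertable
    finally:
        connection.close()
```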

Get Started Today

Now it’s your turn! Pick your favorites from the AWS tools and services list and apply them to your next time-series, analytics, or events project.

Do you have feedback or suggestions for more AWS tools and services we should cover next? Let us know in the Timescale Community Forum or on Twitter @TimescaleDB.
