12 Things You Need to Know About Time-Series Data

From PostgreSQL pro tips and example projects to critical features and database evaluation criteria, we’ve rounded up an all-in-one guide of tips and recommended resources to help you do more with your time-series data.

We just finished #12daysofTimescale, a two-week-long daily content countdown to celebrate the end of the year, share our most popular technical content and tips with developers far and wide, and shine a light on our amazing community.

To help you tackle new projects and improve existing ones, we've rounded up all of the featured content from our #12daysofTimescale countdown, distilling the key message into tips, advice, and/or recommendations. With topics ranging from ways to optimize your database performance and integrate with third-party tools to the things you need to consider when evaluating time-series databases, there's something for everyone - whether you're new to time-series or an experienced DBA.

The result: a cheatsheet of things you need to know about time-series databases, sourced from our internal teams and active developer community. (Some may be a refresher, while others may surprise you.)

12. "Big cloud" providers don't necessarily offer better products.

Resource: TimescaleDB vs. Amazon Timestream: 6000x higher inserts, 5-175x faster queries, 150x-220x cheaper

No one wants to start out with a database only to find it doesn't scale or suit their needs as apps and systems grow. As this post points out, time-series databases vary widely in terms of ingest speed, query latency, ease of use, reliability, and more.

We have a history of benchmarking time-series database performance, and we spent weeks analyzing Amazon Timestream insert performance, query speed, developer experience, and reliability – and the title says it all: based on our tests, TimescaleDB dramatically outperforms Amazon Timestream in every area.  

Check out the full post for detailed results, key database consideration criteria, and steps to reproduce the results and run your own benchmarks.

For Amazon Timestream vs. TimescaleDB benchmark highlights, check out Timescale Developer Advocate Ryan Booz's (@ryanbooz) epic thread.

11. Time-series data is great for financial services, from traditional stock markets to cryptocurrency.

Resource: Learn how to power a (successful) crypto trading bot with TimescaleDB

Read how Felipe, a software developer and active TimescaleDB community member, built his crypto trading bot (and netted 480x returns) using TensorFlow, Node.js, TimescaleDB, and machine-learning sentiment analysis models. He also shares the lessons he learned along the way and his advice for aspiring crypto traders.

And, if you want to try your own crypto analysis, check out our Analyze Cryptocurrency Market Data tutorial (includes step-by-step instructions and 5+ sample queries).

Moreover, time-series isn't just a niche reserved for IoT, oil and gas, and finance; time-series data is everywhere, from tracking package delivery fleet logistics to monitoring systems and applications, predicting flight arrivals, and reporting air quality. (See our primer on time-series data to learn more about what makes time-series data unique.)

If you're not sure where to start or if time-series data applies to your scenario, our Developer Q & A series features community members sharing the awesome ways they’re using data to solve problems, improve processes, and, in the case of Felipe's crypto bot, turn a side project into a money-making machine.

10. Continuously optimizing your database insert rate is especially critical for time-series workloads.

Resource: Get our 13 tips to improve PostgreSQL Insert performance

With time-series data, changes are treated as inserts, not overwrites – and when you need to retain all data vs. overwriting past values, optimizing the speed at which your database can ingest new data becomes essential.
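To make the insert-vs-overwrite distinction concrete, here is a minimal Python sketch. It is an illustration only: the `Reading` shape and both function names are hypothetical, and a real workload would write to a database rather than to in-memory structures.

```python
from typing import Dict, List, Tuple

# (unix_time, sensor_id, value): a hypothetical reading shape for illustration.
Reading = Tuple[int, str, float]


def store_overwrite(readings: List[Reading]) -> Dict[str, float]:
    """Keep only the latest value per sensor, replacing earlier values."""
    latest: Dict[str, float] = {}
    for _, sensor_id, value in readings:
        latest[sensor_id] = value  # the earlier value is lost
    return latest


def store_append(readings: List[Reading]) -> List[Reading]:
    """Keep every reading as its own row, the way time-series data is stored."""
    log: List[Reading] = []
    for reading in readings:
        log.append(reading)  # nothing is replaced, so history is preserved
    return log
```

With two sensors reporting three times each, the overwrite store ends up with two entries while the append store holds six rows, which is why ingest speed matters more as history accumulates.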

To help you improve your database performance and optimize for time-series scenarios, Timescale CTO Mike Freedman (@michaelfreedman) shares his top tips. You’ll get advice for vanilla PostgreSQL - like how to test I/O performance - and a few TimescaleDB-specific recommendations.

To see a few of the techniques in action, Timescale Developer Advocate Avthar (@avthars) demos 5 of his favorites and breaks down the factors that affect ingest rate in his 5 Ways to Improve Your PostgreSQL INSERT performance technical webinar.

For step-by-step demos and pro tips, watch 5 Ways to Improve Your PostgreSQL INSERT performance.
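One widely used way to raise ingest rate is to send many rows per statement instead of one row at a time. The sketch below shows the idea in Python under a few assumptions: the table name, the three-column schema, and both helper names are hypothetical, and the `%s` placeholders follow the parameter style used by DB-API drivers such as psycopg2. It only builds the statements; executing them against a database is left out.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple

# (timestamp, device_id, value): a hypothetical schema for illustration.
Row = Tuple[str, str, float]


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_insert(table: str, batch: Sequence[Row]) -> Tuple[str, list]:
    """Build one multi-row INSERT plus its flat parameter list."""
    placeholders = ", ".join(["(%s, %s, %s)"] * len(batch))
    sql = f"INSERT INTO {table} (time, device_id, value) VALUES {placeholders}"
    params = [field for row in batch for field in row]
    return sql, params
```

Each `(sql, params)` pair could then be passed to a driver's `cursor.execute(sql, params)`. The right batch size depends on your row width and hardware, so treat any number you pick as a starting point to measure against.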

9. Enabling compression dramatically reduces your storage costs, speeds up queries, and allows you to retain more data.

Resource: Learn about time-series compression algorithms

Compression algorithms: they’re not magic, but they can dramatically reduce your data storage costs and speed up your queries. Given the relentless nature of time-series data, where data piles up quickly, shrinking your data storage needs is even more critical.

Ajay, Timescale CEO and co-founder, joined forces with Josh - long-time Timescale Engineer - to dive into the history of databases and deliver this in-depth analysis of popular time-series compression methods (delta-delta encoding, Simple-8b, and beyond), and when and why you’d want to use certain types.
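To give a feel for why delta-delta (delta-of-delta) encoding suits timestamps, here is a minimal Python sketch. It is a simplified illustration, not TimescaleDB's implementation, and the function names are hypothetical. It stores the first value, the first delta, and then only the change between consecutive deltas, which is mostly zeros when data arrives at a regular interval.

```python
from typing import List


def delta_of_delta_encode(timestamps: List[int]) -> List[int]:
    """Return [first value, first delta, then the deltas of the deltas]."""
    if len(timestamps) < 2:
        return list(timestamps)
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    delta_deltas = [b - a for a, b in zip(deltas, deltas[1:])]
    return [timestamps[0], deltas[0]] + delta_deltas


def delta_of_delta_decode(encoded: List[int]) -> List[int]:
    """Rebuild the original timestamps from the encoded form."""
    if len(encoded) < 2:
        return list(encoded)
    values = [encoded[0]]
    delta = encoded[1]
    values.append(values[-1] + delta)
    for delta_delta in encoded[2:]:
        delta += delta_delta  # recover the next delta
        values.append(values[-1] + delta)
    return values
```

For example, readings taken every 10 seconds with one arriving a second late encode to a start value, one delta, and a run of zeros and ones. A real implementation would then pack those small integers into fewer bits (the job of schemes such as Simple-8b), a step this sketch omits.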

✨ Fun fact: We’ve built a combination of best-in-class compression algorithms into TimescaleDB to help our users get 90%+ storage efficiencies, save disk space, and get faster query results.

Resource: Follow our Time-Series Forecasting tutorial to get started.

Time-series forecasting alone is powerful. But, joining time-series data with other relational business data allows you to create more insightful forecasts about how your data (and business) will change over time.

To demonstrate how to get started, our step-by-step tutorial takes you through how to use R, Apache MADlib, Python, and PostgreSQL to:

🔎 Analyze your data

📈 Predict future sales

❄️ Plan for seasonal fluctuations

You’ll see how to apply both Holt-Winters and ARIMA time-series forecasting modeling methods, using a sample dataset and example queries to get you up and running quicksmart.
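If you want a feel for what Holt-Winters does before opening the tutorial, here is a minimal pure-Python sketch of the additive variant. It is not the tutorial's code: the function name, the heuristic initialization from the first two seasons, and the parameter names are assumptions made for illustration, and a production forecast would normally rely on a tested library and fitted smoothing weights.

```python
from typing import List, Sequence


def holt_winters_additive(
    series: Sequence[float],
    season_len: int,
    alpha: float,
    beta: float,
    gamma: float,
    horizon: int,
) -> List[float]:
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")

    # Heuristic starting values taken from the first two seasons.
    first_mean = sum(series[:season_len]) / season_len
    second_mean = sum(series[season_len:2 * season_len]) / season_len
    level = first_mean
    trend = (second_mean - first_mean) / season_len
    seasonal = [series[i] - first_mean for i in range(season_len)]

    # Update level, trend, and seasonal terms for every later observation.
    for t in range(season_len, len(series)):
        idx = t % season_len
        prev_level, prev_trend = level, trend
        level = alpha * (series[t] - seasonal[idx]) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonal[idx] = (
            gamma * (series[t] - prev_level - prev_trend) + (1 - gamma) * seasonal[idx]
        )

    # Project the last level and trend forward, adding the matching seasonal term.
    n = len(series)
    return [
        level + h * trend + seasonal[(n + h - 1) % season_len]
        for h in range(1, horizon + 1)
    ]
```

Calling `holt_winters_additive(monthly_sales, season_len=12, alpha=0.5, beta=0.1, gamma=0.3, horizon=6)` on a hypothetical `monthly_sales` list would return six forecast values. The smoothing weights shown are placeholders to tune, not recommendations.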

See all #12daysofTimescale tweets

7. If you select the right database, you can integrate with your favorite third-party and open-source tools.

Resource: See our favorite PostgreSQL extensions for time-series

With 20K+ extensions to choose from, we love PostgreSQL for its vast ecosystem and extreme extensibility. And, luckily, many extensions help you work more efficiently with time-series data without the hassle of switching to a whole new database.

But, where do you start?

To help you find options that might be right for you, we surveyed our internal team members and active community members to source our “must have” extension list, including a few less widely known - but useful - ones.

⭐️ Bonus: installation instructions and sample queries to show you how to get each extension, how it works, and what it allows you to do.

6. Database architecture, flexibility, and query language matter – and can vary widely.

Resource: Read how TimescaleDB and InfluxDB are purpose-built differently – and how this impacts performance

While our Amazon Timestream benchmarks demonstrate that choosing the right time-series database isn't as simple as choosing from the "big" cloud providers, our InfluxDB comparison demonstrates the importance of understanding your requirements, such as query language, developer onboarding time, ecosystem, and fully managed database options.

We report where InfluxDB outperforms TimescaleDB (low-cardinality queries), and use data to show why TimescaleDB is the better choice if you have high-cardinality datasets, want a flexible hosted database option, and/or don’t want to learn a proprietary query language.

FluxQL < SQL

Our goal: provide honest evaluation and results to help you - and all developers - choose the database that’s best suited for their needs. (For more on this topic, see our InfluxDB vs. TimescaleDB at-a-glance comparison, and our Outflux - migration to TimescaleDB from InfluxDB tutorial.)

⚖️ For more comparisons, see how TimescaleDB stacks up vs. MongoDB, vanilla PostgreSQL, and other alternatives.

5. Grafana is extremely well suited to time-series, but there's a learning curve.

Resource: Watch Guide to Grafana 101: Getting Started with (awesome) Visualizations

Grafana is an amazing open source visualization tool (we love it at Team Timescale) and well-suited to common time-series scenarios, but there are a lot of features that you may not know how, when, or why to use.

To help you see how and why Grafana is ideal for time-series, Avthar (@avthars) demos how to build 6+ visualizations - from world maps to gauges - for IoT, DevOps, and more. You’ll see real examples and get the best practices, code samples, and inspiration you need to create your own (awesome) visualizations.

For even more Grafana pro tips, watch Avthar’s next session - Guide to Grafana 101: Getting Started with Alerts - to learn how to set custom notifications for the metrics you care about and connect to Slack, PagerDuty, OpsGenie, and other popular tools.

Grafana alert lifecycles and states cheatsheet from Guide to Grafana 101: Getting Started with Alerts

4. You can host your time-series data on your favorite cloud provider.

Resource: Managed Service for TimescaleDB - a multi-cloud, fully-managed service for time-series data - is available on AWS, Azure, and GCP, with 75+ regions and 2,000 configurations

With time-series data, each data point is inserted as a new value, instead of overwriting the prior (i.e., earlier) value. As a result, time-series workloads grow much faster than other types of data, and you need a database that can grow with you - without astronomical costs or compromised performance.

We launched Managed Service for TimescaleDB to solve this problem: allow developers to get the power of TimescaleDB, with worry-free operations and the ability to grow, shrink, and migrate workloads with ease.

In 2020, Managed Service for TimescaleDB got even better: now available in 75+ regions across AWS, GCP, and Azure, with fine-grained CPU/storage configuration options, to give developers everywhere ultimate flexibility and control.

👉 Try Managed Service for TimescaleDB for free (30-day trial).

Editor's Note: As of September 2021, "Timescale Cloud" is now Managed Service for TimescaleDB.

3. Interactive dashboards increase your time-series data's utility – and can save you from running one-off queries for stakeholders.

Resource: Learn how to use Grafana variables to make more interactive dashboard visualizations

This is the second time Grafana appears on our top content list, and for good reason: it can be tricky to build useful dashboards that allow you and your teammates to easily drill into your data.

In this short how-to, you’ll learn how to use SQL and Grafana’s variables feature to build your own interactive dashboards. We use the example of monitoring the real-time location of New York City buses, but the steps we detail will work for any time-series scenario.

GIF of New York City real-time bus location map, with cursor changing bus route and 'dots' for bus locations changing accordingly.
Grafana variables allow you to use a drop-down menu to select various options, no code modifications required.
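To picture what happens when you pick a different option from the drop-down, here is a small Python sketch that imitates the substitution step. It is only an imitation: Grafana performs this replacement itself, the table and column names are hypothetical, and Python's `string.Template` is used simply because it shares the `$name` placeholder style.

```python
from string import Template

# Query text as it might be typed into a dashboard panel. The table and
# column names are hypothetical; $route stands for a dashboard variable.
PANEL_QUERY = Template(
    "SELECT time, latitude, longitude "
    "FROM bus_locations "
    "WHERE route_id = '$route' "
    "ORDER BY time"
)


def render_panel_query(selected_route: str) -> str:
    """Imitate swapping the variable for the option chosen in the drop-down."""
    return PANEL_QUERY.substitute(route=selected_route)
```

The query text stays the same and only the selected value changes, which is why no code modifications are needed when someone switches routes. Plain text substitution like this is not safe for untrusted input, so keep it to illustration.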

To continue your journey to Grafana mastery, explore our visualizing time-series data in Grafana mega-tutorial.

2. You don't have to compromise on database quality or price - you have options.

Resource: Learn how we’re building a self-sustaining open-source time-series database (and business)

Picking the right technology for you goes beyond features and functionality; you (often) want to choose an option with an active and passionate community that allows you to tap into others' expertise, ask questions, and search a wealth of resources.

Over the last 12 months, we doubled down on our community (TimescaleDB has the largest time-series developer community) and our cloud business, announcing all of our software features would be free forever.

Learn about our philosophy, from how we've granted more rights to users and abolished the notion of "enterprise" features to why our approach to licensing matters for the future of software innovation.

See full Twitter thread and community comments. Editor's Note: As of September 2021, "Timescale Cloud" is now Managed Service for TimescaleDB.

1. A relational database for time-series can infinitely scale.

Resource: Learn how we're making multi-node TimescaleDB free and other ways we're investing in our community

The final thing we'd like to impart about time-series data: a relational database can scale out across multiple machines – and that's exactly what we're doing with TimescaleDB 2.0.

TimescaleDB 2.0 is now available, making TimescaleDB the only multi-node, petabyte-scale database for time-series. And, in addition to multi-node, we’ve added new functionality and enhanced core features to give users more control and flexibility.

This is a major milestone for us, our community, and the industry as a whole – and we couldn’t have done it without our amazing internal teams, customers, and beta testers who submitted countless amounts of feedback, issues, and fixes.

Watch our All Things TimescaleDB 2.0 YouTube playlist (6 videos) to get an overview of all new features, then dive into feature-specific videos, demos, and tips.

Wrapping up

To get started with TimescaleDB and put these resources and tips into practice, try our hosted database for free (30-day trial).

If you prefer to self-manage TimescaleDB, see our GitHub repository for installation options (⭐️ always welcome and appreciated!).

Lastly, join our Slack community to ask questions, get help, and learn more about all things time-series; our engineers and community members are active in all channels.
