What Is Time-Series Data? (With Examples)
What Is Time-Series Data?
Time-series data is a sequence of data points collected over time intervals, allowing us to track changes over time. Time-series data can track changes over milliseconds, days, or even years.
Having access to detailed, feature-rich time-series data has become one of the most valuable commodities in our information-hungry world. Businesses, governments, schools, and communities, large and small, are finding invaluable ways to mine value from analyzing time-series data.
(You can read how real-world teams, like those tracking real-time flight data or building platforms for smarter cities, analyze their time-series metrics in our Developer Q&A series.)
Developer usage patterns reflect the same trend. Over the past two years, time-series databases (TSDBs, also called time-series database management systems, or DBMSs) have remained the fastest-growing category of databases.
As the developers of an open-source time-series database, my team and I are often asked about this trend and how it should factor into your decisions about which database to select. Specifically, does it really matter if you start with a database specialized for time-series data—or can you easily transition to one later?
To answer those questions, let me start with a more in-depth description of what time-series data is and how you might benefit from using a time-series database, and leave you with a few ways to start exploring time-series data and performing your own analysis.
What Are Some Examples of Time-Series Data?
In the past, our view of time-series data was more static: the daily highs and lows in temperature, the opening and closing value of the stock market, or even the daily or cumulative hospitalizations due to COVID-19.
Unfortunately, these totals missed the nuances of how the underlying changes over time contributed to these static values.
Let’s consider a few examples.
Unveiling banking nuances with time-series databases
If I send you $10, a traditional bank database will debit my account and credit your account. Then, if you send me $10, the same process happens in reverse. At the end of this process, our bank balances would look the same, so the bank might think, “Oh, nothing changed this month.” But, with a time-series database, the bank would see, “Hey, these two people keep sending each other $10; there’s likely a deeper relationship here.” Tracking this nuance, our month-ending account balance takes on greater meaning.
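The difference between the two views can be sketched in a few lines of Python. This is only an illustration: the account names, timestamps, and the `transfers` list are hypothetical, and a real bank ledger would be far richer.

```python
from collections import Counter

# Hypothetical transfer events: (timestamp, sender, receiver, amount_usd).
transfers = [
    ("2024-03-01T09:15:00", "alice", "bob", 10),
    ("2024-03-08T18:40:00", "bob", "alice", 10),
    ("2024-03-15T09:05:00", "alice", "bob", 10),
    ("2024-03-22T19:10:00", "bob", "alice", 10),
]

def net_balance_change(events):
    """Snapshot view: only the net change per account survives."""
    change = Counter()
    for _, sender, receiver, amount in events:
        change[sender] -= amount
        change[receiver] += amount
    return dict(change)

def transfers_per_pair(events):
    """Time-series view: count the transfers between each pair of accounts."""
    pairs = Counter()
    for _, sender, receiver, _ in events:
        pairs[frozenset((sender, receiver))] += 1
    return pairs

net = net_balance_change(transfers)
pair_counts = transfers_per_pair(transfers)

print(net)                                        # net change is zero for both accounts
print(pair_counts[frozenset(("alice", "bob"))])   # number of transfers between the two
```

The month-end balances say nothing happened, while the event history shows four transfers between the same two people.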
Tracking environmental efficiency with time series
Next, think about an environmental value like mean daily temperature (MDT), the average of each day's high and low temperature at a location. Over the last few decades, MDT has been used as a primary variable to calculate buildings’ energy efficiency.
In any given week, MDT might only vary slightly from day to day in a location, but the contributing environmental factors could be changing drastically over that same period. Instead, knowing how the temperature changed each hour throughout the day, coupled with precipitation, cloud cover, and wind speed, could dramatically improve your ability to model and optimize energy efficiency for your properties.
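As a rough illustration, the sketch below compares two hypothetical days that share the same MDT but have very different temperature profiles. The readings are invented and taken every three hours to keep the example short; the variable names are assumptions, not part of any real dataset.

```python
from statistics import mean

# Hypothetical readings in degrees Celsius, one every three hours (8 per day).
steady_day = [13, 12, 13, 15, 17, 18, 16, 14]
spike_day = [12, 12, 12, 12, 13, 18, 13, 12]

def mean_daily_temperature(readings):
    """MDT as described above: the average of the day's high and low."""
    return (max(readings) + min(readings)) / 2

for name, readings in (("steady", steady_day), ("spike", spike_day)):
    print(name, mean_daily_temperature(readings), mean(readings))
```

Both days report an MDT of 15, yet the average of the actual readings differs (14.75 versus 13), which is the kind of detail a single daily value hides.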
Time-series analysis of COVID-19 hospitalizations
Likewise, while knowing the total number of COVID-19 hospitalizations per day in your community is valuable, that number alone isn’t very descriptive. For instance, the hospital might disclose daily numbers that show 20 hospitalizations on Monday and increase slightly throughout the week to 23 hospitalizations on Friday.
At first glance, it looks like a 15% increase in hospitalizations this week. But if we add detail to each of those records (and increase the frequency at which we collect them), we might see that the net increase of 3 patients hides more activity: 10 people were discharged and 13 were admitted, so new admissions over the last five days equal 65% of Monday's count.
Tracking each aspect of patient data over time (e.g., patient age, admitted or discharged, days to recovery, etc.) helps us understand how we arrive at the daily counts, allowing us to better analyze trends, accurately report totals, and take action. In the case of total COVID-19 hospitalizations, the details behind this analysis impact public policy in the cities and towns where we live.
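The hospitalization arithmetic can be reproduced with a small sketch. The Monday count and the weekly totals (13 admitted, 10 discharged) come from the example above; the day-by-day split in `daily_flow` is a hypothetical breakdown chosen only so that the totals match.

```python
monday_census = 20

# Hypothetical daily flow: (day, admitted, discharged).
daily_flow = [("Tue", 3, 2), ("Wed", 3, 3), ("Thu", 4, 3), ("Fri", 3, 2)]

# Rebuild the daily census from the individual admissions and discharges.
census = monday_census
census_by_day = {"Mon": monday_census}
for day, admitted, discharged in daily_flow:
    census += admitted - discharged
    census_by_day[day] = census

total_admitted = sum(admitted for _, admitted, _ in daily_flow)
total_discharged = sum(discharged for _, _, discharged in daily_flow)

net_change = census - monday_census
net_change_pct = 100 * net_change / monday_census
admissions_pct = 100 * total_admitted / monday_census

print(census_by_day)
print(net_change, net_change_pct)      # what the daily totals show
print(total_admitted, admissions_pct)  # what the detailed records show
```

The daily totals only expose the net change, while the per-event records let you separate admissions from discharges.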
The financial sector is a typical example of time-series data usage: be it stocks, cryptocurrencies or other financial assets, time-series data allows you to see how prices changed over time and helps you spot trends. As an example, here’s a time-series chart showing you the intraday price changes of the Bitcoin cryptocurrency:
Time-series data allows you not just to know the current price of the asset but also how it changed in the past.
Internet of Things and sensor data
Whether you’re recording motor temperatures in factories, monitoring cannabis cultivation, or even using IoT data to control a nuclear fusion experiment, you are leveraging time-series data to make better decisions.
Once you have sensors that send data into your time-series database, you can create real-time dashboards and analyze historical data.
Imagine you maintain a web application. Every time a user logs in, you may just update a “last_login” timestamp for that user in a single row in your “users” table. But what if you treated each login as a separate event and collected them over time? With that kind of time-series data, you could analyze historical login activity, see how usage increases or decreases over time, bucket users by how often they access the app, and more.
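Here is a minimal sketch of that idea in Python, assuming a hypothetical `login_events` list with one entry per login. The user names and timestamps are invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical login events: one (user, timestamp) entry per login.
login_events = [
    ("ana", datetime(2024, 5, 1, 8, 30)),
    ("ben", datetime(2024, 5, 1, 9, 0)),
    ("ana", datetime(2024, 5, 2, 8, 45)),
    ("ana", datetime(2024, 5, 3, 21, 10)),
    ("ben", datetime(2024, 5, 3, 9, 5)),
]

# Snapshot view: what a single "last_login" column would retain.
last_login = {}
for user, ts in login_events:
    if user not in last_login or ts > last_login[user]:
        last_login[user] = ts

# Time-series view: questions that only the full event history can answer.
logins_per_day = Counter(ts.date().isoformat() for _, ts in login_events)
logins_per_user = Counter(user for user, _ in login_events)

print(last_login["ana"])
print(dict(logins_per_day))
print(dict(logins_per_user))
```

The snapshot can always be derived from the events, but the events cannot be recovered from the snapshot.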
Another example that has become vital to every IT group around the world: operational metrics for servers, networks, applications, environments, and more. This kind of time-series metric data is crucial to keeping the services we rely on running without interruption. By tracking the changes in each metric, IT departments can quickly identify problems, plan for capacity increases during upcoming events, and diagnose if an application update resulted in changed user behavior, for better or worse. (See how NLP Cloud monitors their language AI API.)
Web3 and blockchain data
In the past year, we’ve seen a surge in companies that use TimescaleDB to build web3 and blockchain tools. Blockchains are made of timestamped blocks and transactions. There are several types of data to be recorded to drive smarter decisions in the industry. Think of NFT transaction monitoring, blockchain exploration, mining analytics, or even criminal investigations.
These examples illustrate how modern time-series data differs from what we’ve known in the past. Time-series data analysis goes far deeper than a pie chart or Excel workbook with columns of summarized totals.
This detailed data doesn’t just include time as a metric but as a primary component that helps to analyze our data and derive meaningful insights.
There are many other kinds of time-series data. Still, regardless of the scenario or use case, all time-series datasets have three things in common:
- The data that arrives is almost always recorded as a new entry.
- The data typically arrives in time order.
- Time is a primary axis (time intervals can be either regular or irregular).
In other words, time-series data workloads are generally “append-only.” While these workloads may need to correct erroneous data after the fact or handle delayed or out-of-order data, these are exceptions, not the norm.
Simply put, time-series datasets track changes to the overall system as INSERTs, not UPDATEs, resulting in an append-only ingestion pattern.
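To make the INSERT-versus-UPDATE distinction concrete, here is a small sketch using Python's built-in `sqlite3` module. SQLite is only a stand-in here, not a time-series database, and the table names, column names, and sensor readings are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Update-in-place model: one row per sensor, overwritten on every reading.
conn.execute("CREATE TABLE current_temp (sensor_id TEXT PRIMARY KEY, temp_c REAL)")
# Append-only model: one new row per reading.
conn.execute("CREATE TABLE readings (ts TEXT, sensor_id TEXT, temp_c REAL)")

# Hypothetical readings from a single sensor.
samples = [
    ("2024-06-01T12:00:00", "s1", 21.5),
    ("2024-06-01T12:05:00", "s1", 22.0),
    ("2024-06-01T12:10:00", "s1", 23.5),
]

for ts, sensor_id, temp_c in samples:
    # Overwrites the previous value, so earlier readings are lost.
    conn.execute("INSERT OR REPLACE INTO current_temp VALUES (?, ?)", (sensor_id, temp_c))
    # Appends a new row, so the full history is preserved.
    conn.execute("INSERT INTO readings VALUES (?, ?, ?)", (ts, sensor_id, temp_c))

snapshot_rows = conn.execute("SELECT COUNT(*) FROM current_temp").fetchone()[0]
history_rows = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
temp_rise = conn.execute(
    "SELECT MAX(temp_c) - MIN(temp_c) FROM readings WHERE sensor_id = 's1'"
).fetchone()[0]

print(snapshot_rows, history_rows, temp_rise)
```

Only the append-only table can answer how much the temperature changed over the period; the overwritten table knows just the latest value.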
This practice of recording each and every change to the system as a new, different row is what makes time-series data so powerful. It allows us to measure and analyze change: what has changed in the past, what is changing in the present, and what changes we forecast may look like in the future.
You may also notice that some of these examples describe a common type of time-series data known as event data.
Why Do I Need a Time-Series Database?
You might ask: Why can’t I just use a “normal” (i.e., non-time-series) database?
The truth is that you can, and some people do. But there are at least two reasons why time-series databases are the fastest-growing category of databases today: scale and usability.
Add the best price-performance to this equation, and we believe these are some of the reasons why you should sign up for Timescale—you'll get a 30-day free trial, no credit card required!
But let's dive into these features.
Time-series data accumulates very quickly, and normal databases are not designed to handle that scale (at least not in an automated way). Traditionally, relational databases fare poorly with very large datasets, while NoSQL databases are better at scale (although a relational database fine-tuned for time-series data can actually perform better, as we’ve shown in benchmarks versus InfluxDB, versus Cassandra, and versus MongoDB).
On the other hand, time-series databases, whether relational or NoSQL-based, introduce efficiencies that are only possible when you treat time as a first-class citizen. These efficiencies let them operate at massive scale, with benefits ranging from performance improvements (higher ingest rates and faster queries, although some databases support more query types than others) to better data compression.
TSDBs also typically include built-in functions and operations common to time-series data analysis, such as data retention policies, continuous aggregate queries, flexible time bucketing, etc.
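To give a feel for what two of these operations do, here is a plain-Python sketch of time bucketing and a retention policy. It is not how any particular TSDB implements them; the function names `time_bucket`, `bucket_average`, and `apply_retention` and the sample points are assumptions made for this illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

EPOCH = datetime(1970, 1, 1)

def time_bucket(ts, width):
    """Floor a timestamp to the start of its bucket (buckets aligned to the Unix epoch)."""
    return EPOCH + ((ts - EPOCH) // width) * width

def bucket_average(points, width):
    """Average the values that fall into each time bucket."""
    grouped = defaultdict(list)
    for ts, value in points:
        grouped[time_bucket(ts, width)].append(value)
    return {start: mean(values) for start, values in sorted(grouped.items())}

def apply_retention(points, now, keep_for):
    """Keep only the points that are newer than the retention cutoff."""
    cutoff = now - keep_for
    return [(ts, value) for ts, value in points if ts >= cutoff]

# Hypothetical CPU usage samples (percent), arriving at irregular intervals.
points = [
    (datetime(2024, 6, 1, 12, 1), 40.0),
    (datetime(2024, 6, 1, 12, 9), 50.0),
    (datetime(2024, 6, 1, 12, 16), 70.0),
    (datetime(2024, 6, 1, 12, 29), 90.0),
]

averages = bucket_average(points, timedelta(minutes=15))
recent = apply_retention(
    points, now=datetime(2024, 6, 1, 12, 30), keep_for=timedelta(minutes=15)
)

for start, avg in averages.items():
    print(start.isoformat(), avg)
print(len(recent))
```

A database that offers these operations natively spares you from maintaining this kind of logic in application code and can run it close to the data.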
Even if you’re just starting to collect this type of data and scale is not a concern at the moment, these features can still provide a better user experience and make data analysis tasks easier. Having built-in functions and features to analyze trends readily available at the data layer often leads you to discover opportunities you didn’t know existed, no matter how big or small your dataset.
This is why developers are increasingly adopting time-series databases and using them for a variety of use cases:
- Monitoring software systems: virtual machines, containers, services, applications
- Monitoring physical systems: equipment, machinery, connected devices, the environment, our homes, our bodies
- Asset tracking applications: vehicles, trucks, physical containers, pallets
- Financial trading systems: classic securities, newer cryptocurrencies
- Eventing applications: tracking user/customer interaction data
- Business intelligence tools: tracking key metrics and the overall health of the business
- And more
Once you begin to see more of the information your applications store as time-series data, you still have to pick a time-series database that best fits your data model, write/read pattern, and developer skill sets.
Although NoSQL time-series database options have prevailed for the past decade as the storage medium of choice, more and more developers are seeing the downside to storing time-series data separately from business data (most time-series databases don’t provide good support for relational data).
In fact, this poor developer experience was one of the driving factors in why we started Timescale. Keeping all of your data in one system can drastically reduce application development time—and the speed at which you can make key decisions.
Nowhere is this more evident than with the rise of numerous self-service business intelligence tools like Tableau, Power BI, and, yes, even Excel. Users struggle to make timely, business-critical observations when precious time-series data is kept separate from business data. Instead, users find that they need to rely on these third-party tools to mash up data into something meaningful.
There are many good reasons to use these powerful tools, but quickly querying your time-series data alongside meaningful metadata shouldn’t be one of them. SQL has been built and honed over decades to provide efficient ways of generating these valuable aggregations and analyses.
The bottom line is that knowing which of your data is time-series data and where you store it can dramatically impact your future success.
Now, It’s Your Turn: Start Analyzing Your Time-Series Data
If you’re convinced you need a time-series database or want to try it out, spin up a fully managed TimescaleDB instance—free for 30 days.
From there, follow our getting started guide to configure your database and execute your first query, then choose one of our fun tutorials to delve deeper into TimescaleDB.
You can also read stories from people who develop real-world time-series data applications:
- Using IoT Sensors, TimescaleDB, and Grafana to Control the Temperature of the Nuclear Fusion Experiment at the Max Planck Institute
- Processing and Protecting Hundreds of Terabytes of Blockchain Data: Zondax’s Story
- How NLP Cloud Monitors Their Language AI API
- More stories!
Have questions or want to learn more? Join our Slack Community and Forum, where Timescale engineers and community members are active in all channels.