# What Is Time-Series Forecasting?

As data becomes ubiquitous, measuring change is vital to understanding the world. Here at Timescale, we use our PostgreSQL and TimescaleDB superpowers to generate insights into data: to see *what* changed and *how*, but also *when* it changed—that’s the beauty of time-series data. But if you have data showing past and present trends, can you predict the future? Cue time-series forecasting.

In the simplest terms, **time-series forecasting** is a technique that utilizes historical and current data to predict future values over a period of time or a specific point in the future. By analyzing data that we stored in the past, we can make informed decisions that can guide our business strategy and help us understand future trends.

Some of you may be asking yourselves what the difference is between time-series forecasting and algorithmic predictions using, for example, machine learning. Well, machine learning techniques such as random forest, gradient boosting regressor, and time delay neural networks can be used to extrapolate time-series data, but they are far from the only available options or the best ones (as you will see in this article). The most important property of a time-series algorithm is the ability to extrapolate patterns outside of the domain of training data, which most machine learning techniques cannot do by default. This is where specialized time-series forecasting techniques come in.

There are plenty of forecasting techniques to choose from, and this article will help you acquire a basic understanding of the most popular ones. From simple linear regression models to complex and vast neural networks, each forecasting method has its own benefits and drawbacks.

Let’s check them out.

## Applications of Time-Series Forecasting

Quite a few industries and scientific fields are utilizing time-series forecasting. Some of the most relevant ones include:

- Business planning
- Control engineering
- Cryptocurrency trends
- Financial markets
- Modeling disease spreading
- Pattern recognition
- Resource allocation
- Signal processing
- Sports analytics
- Statistics
- Weather forecasting

As you can see, the list is already quite long, but the truth is that anyone who has access to accurate historical data can utilize time-series analysis methods to forecast future developments and trends.

## When Is Time-Series Forecasting Useful?

Even though time-series forecasting may seem like a universally applicable technique, there are some limitations that developers need to be aware of. Because forecasting isn’t a strictly defined method but rather a combination of data analysis techniques, analysts and data scientists need to consider the limitations of the prediction models, as well as of the data itself.

The most crucial step when considering time-series forecasting is **understanding your data** model and knowing **which business questions** need to be answered using this data. By diving into the problem domain, a developer can more easily distinguish random fluctuations from stable and constant trends in historical data. This will come in handy when tuning the prediction model to generate the best forecasts and even considering which forecasting method to use.

When using time-series analysis, some data limitations need to be considered. Common problems include generalizing from a single data source, difficulty obtaining appropriate measurements, and accurately identifying the correct model to represent the data.

## What to Consider When You Do Time-Series Forecasting

There are quite a few factors associated with time-series forecasting, but the most important ones include the following:

- **Amount of data**
- **Data quality**
- **Seasonality**
- **Trends**
- **Unexpected events**

The **amount of data** is probably the most important factor (assuming that the data is accurate). A good rule of thumb: *the more data we have, the better the forecasts our model will generate*. More data also makes it much easier for our model to distinguish between trends and noise.

**Data quality** entails some basic requirements: no duplicates, a standardized data format, and data collected consistently, at regular intervals.

**Seasonality** means that there are distinct periods of time when the data contains consistent irregularities. For example, if an online web shop analyzed its sales history, it would be evident that the holiday season results in increased sales. In this example, we can deduce the correlation intuitively, but in many other cases, analysis methods such as time-series forecasting are needed to detect such consumer behavior.

**Trends** are probably the most important information you are looking for. They indicate whether a variable in the time series will increase or decrease in a given period. We can also calculate the probability of a trend in order to make even more informed decisions with our data.

**Unexpected events** (sometimes also referred to as noise or irregularities) can always occur, and we need to account for that when creating a prediction model. They appear as noise in historical data, and they are inherently unpredictable.

## Overview of Time-Series Forecasting Methods

Below is a basic overview of the forecasting methods this article covers and the theory behind them:

- Time-series decomposition
- Time-series regression models
- Exponential smoothing
- ARIMA models
- Neural networks
- TBATS

## Time-Series Decomposition

Time-series decomposition is a method for explicitly modeling the data as a combination of **seasonal**, **trend**, **cycle**, and **remainder** components instead of modeling it with temporal dependencies and autocorrelations. It can either be performed as a standalone method for time-series forecasting or as the first step in better understanding your data.

When using a decomposition model, you need to forecast future values for each of the components above and then add these predictions together to find the most accurate overall forecast. Some of the most relevant forecasting techniques using decomposition are Seasonal-Trend decomposition using LOESS, Bayesian structural time-series (BSTS), and Facebook Prophet.

Time-series decomposition refers to a technique that decomposes time-series data into the following four components:

- **Trend**
- **Cycle**
- **Seasonality**
- **Remainder**

### Decomposition based on rates of change

Decomposition based on rates of change is a crucial technique when it comes to analyzing seasonal adjustments. The technique constructs several component series, which, when combined (using additions and multiplications), result in the original time series. Each of the components has a certain characteristic or type of behavior, and they usually include:

- **T_{t}**: The trend component at time *t* describes the long-term progression of the time series. A trend is present when there is a consistent increase or decrease in the direction of the data. The trend component isn’t constrained to a linear function.
- **C_{t}**: The cyclical component at time *t* reflects repeated but non-periodic fluctuations. The duration of these fluctuations depends on the nature of the time series.
- **S_{t}**: The seasonal component at time *t* reflects seasonality (seasonal variation). Such a seasonal pattern can be found in time series that are influenced by seasonal factors. Seasonality usually occurs in a fixed and known period (for example, holiday seasons).
- **I_{t}**: The irregular component (or "noise") at time *t* represents random and irregular influences. It can also be considered the remainder of the time series after the other components have been removed.

**Additive decomposition**

Additive decomposition implies that time-series data is a function of the sum of its components. This can be represented with the following equation:

y_{t} = T_{t} + C_{t} + S_{t} + I_{t}

where y_{t} is the time-series data, T_{t} is the trend component, C_{t} is the cycle component, S_{t} is the seasonal component, and I_{t} is the remainder.

**Multiplicative decomposition**

Instead of using addition to combine the components, multiplicative decomposition defines time-series data as a function of the product of its components. In the form of an equation:

y_{t} = T_{t} * C_{t} * S_{t} * I_{t}

The question is how to identify a time series as additive or multiplicative. The answer is in its variation. If the magnitude of the seasonal component is dynamic and changes over time, it’s safe to assume that the series is multiplicative. If the seasonal component is constant, the series is additive.
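To make the distinction concrete, here is a minimal Python sketch that builds one series of each type from hypothetical components (the values are purely illustrative). Note how the additive series keeps a constant seasonal swing, while the multiplicative series expresses seasonality as a factor around 1, so the swing grows with the trend:

```python
import math

# Hypothetical monthly components, for illustration only
n = 24
trend = [100 + 2 * t for t in range(n)]                             # steady upward trend
seasonal = [10 * math.sin(2 * math.pi * t / 12) for t in range(n)]  # 12-month pattern

# Additive model: y_t = T_t + S_t (cycle and remainder omitted for clarity);
# the seasonal swing is the same size regardless of the trend level
additive = [trend[t] + seasonal[t] for t in range(n)]

# Multiplicative model: y_t = T_t * S_t, with the seasonal component expressed
# as a factor around 1 so the seasonal swing scales with the trend
seasonal_factor = [1 + 0.1 * math.sin(2 * math.pi * t / 12) for t in range(n)]
multiplicative = [trend[t] * seasonal_factor[t] for t in range(n)]
```

Plotting the two lists makes the difference obvious: the additive series oscillates in a band of fixed width, while the multiplicative series fans out as the level rises.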

Some methods combine the trend and cycle components into one trend-cycle component. It can be referred to as the trend component even when it contains visible cycle properties. For example, when using seasonal-trend decomposition with LOESS, the time series is decomposed into seasonal, trend, and irregular (also called noise) components, where the cycle component is included in the trend component.

## Time-Series Regression Models

Time-series regression is a statistical method of forecasting future values based on historical data. The forecast variable is also called the regressand, dependent or explained variable. The predictor variables are sometimes called the regressors, independent or explanatory variables. Regression algorithms attempt to calculate the line of best fit for a given dataset. For example, a linear regression algorithm could try to minimize the sum of the squares of the differences between the observed value and predicted value to find the best fit.

Let’s look at one of the simplest regression models: simple linear regression. The model describes a linear relationship between the forecast variable y and a single predictor variable x:

y_{t} = β_{0} + β_{1} * x_{t} + ε_{t}

The coefficients β_{0} and β_{1} denote the line's intercept and slope. The slope β_{1} represents the average predicted change in y resulting from a one-unit increase in x.

It’s important to note that the observations aren’t perfectly aligned on the straight line but are somewhat scattered around it. Each of the observations y_{t} is made up of a systematic component of the model (β_{0} + β_{1} * x_{t} ) and an “error” component (ε_{t}). The error component doesn’t have to be an actual error; the term encompasses any deviations from the straight-line model.

As you can see, a linear model is very limited in approximating underlying functions, which is why other regression models may be more useful, like Least squares estimation and Nonlinear regression.
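For simple linear regression, the least-squares estimates of β_{0} and β_{1} have a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal pure-Python sketch (the function name and toy data are illustrative, not from any particular library):

```python
def fit_simple_linear_regression(x, y):
    """Least-squares estimates of the intercept (beta0) and slope (beta1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # beta1 = covariance(x, y) / variance(x)
    beta1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
          / sum((xi - mean_x) ** 2 for xi in x)
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

# Toy series: y grows by roughly 2 per step from a base near 5
x = [0, 1, 2, 3, 4]
y = [5.1, 6.9, 9.2, 11.0, 13.1]
beta0, beta1 = fit_simple_linear_regression(x, y)
forecast = beta0 + beta1 * 5   # extrapolate one step ahead, to x = 5
```

The fitted slope lands close to 2 and the intercept close to 5, matching how the toy data was constructed; the residuals around the line play the role of the error component ε_{t}.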

## Exponential Smoothing

When it comes to time-series forecasting, **data smoothing** can tremendously improve the accuracy of our predictions by removing outliers from a time-series dataset. This leads to increased visibility of distinct and repeating patterns that would otherwise be hidden in the noise.

Exponential smoothing is a rule-of-thumb technique for smoothing time-series data using the exponential window function. Whereas the simple moving average method weighs historical data equally to make predictions about the future, exponential smoothing uses exponential functions to calculate decreasing weights over time. Different types of exponential smoothing include simple exponential smoothing and triple exponential smoothing (also known as the Holt-Winters method).
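Simple exponential smoothing boils down to the recurrence s_{t} = α * y_{t} + (1 − α) * s_{t−1}, which is what gives older observations exponentially decreasing weights. A minimal sketch, with a hypothetical helper function and toy data:

```python
def simple_exponential_smoothing(series, alpha):
    """Smooth a series; alpha in (0, 1] controls how quickly old values decay."""
    smoothed = [series[0]]          # initialize with the first observation
    for value in series[1:]:
        # New level = alpha * latest observation + (1 - alpha) * previous level,
        # so each older observation's weight shrinks by a factor of (1 - alpha)
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

data = [3.0, 10.0, 12.0, 13.0, 12.0, 10.0, 12.0]   # hypothetical readings
level = simple_exponential_smoothing(data, alpha=0.5)
one_step_forecast = level[-1]   # flat forecast: next value = last smoothed level
```

A larger α tracks recent changes more aggressively; a smaller α produces a smoother, slower-moving estimate.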

## ARIMA Models

AutoRegressive Integrated Moving Average, or ARIMA, is a forecasting method that combines an autoregressive model and a moving average model. **Autoregression** uses observations from previous time steps to predict future values via a regression equation; the model forecasts the variable of interest using a linear combination of its own past values.

Thus, an autoregressive model of order p can be written as:

y_{t} = c + ϕ_{1}y_{t-1} + ϕ_{2}y_{t−2} + ⋯ + ϕ_{p}y_{t−p} + ε_{t}

where ε_{t} is white noise. This is like a multiple regression model but with delayed values of y_{t} as predictors. We refer to this as an AR(p) model, an autoregressive model of order p.
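The AR(p) equation translates directly into code: a one-step forecast is the constant c plus the lag coefficients applied to the most recent p observations. A minimal sketch, with hypothetical coefficients chosen purely for illustration (in practice they would be estimated from the data):

```python
def ar_forecast(history, phi, c=0.0):
    """One-step AR(p) forecast: y_hat = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p}."""
    p = len(phi)
    lags = history[-p:][::-1]   # most recent observation first, matching phi_1..phi_p
    return c + sum(coef * lag for coef, lag in zip(phi, lags))

# Hypothetical AR(2) model: phi_1 = 0.6, phi_2 = 0.3, constant c = 0.5
history = [1.0, 2.0, 3.0, 4.0]
y_hat = ar_forecast(history, phi=[0.6, 0.3], c=0.5)
# y_hat = 0.5 + 0.6 * 4.0 + 0.3 * 3.0
```

The white-noise term ε_{t} has expectation zero, which is why it drops out of the point forecast.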

On the other hand, a **moving average model** uses a linear combination of forecast errors for its predictions:

y_{t} = c + ε_{t} + θ_{1}ε_{t−1} + θ_{2}ε_{t−2} + ⋯ + θ_{q}ε_{t−q}

where ε_{t} represents white noise. We refer to this as an MA(q) model, a moving average model of order q. The value of ε_{t} is not observed, which means we can’t classify it as a regression in the usual sense.

If we combine differencing with autoregression and a moving average model, we obtain a non-seasonal ARIMA model. The full model can be represented with the following equation:

y′_{t} = c + ϕ_{1}y′_{t−1} + ⋯ + ϕ_{p}y′_{t−p} + θ_{1}ε_{t−1} + ⋯ + θ_{q}ε_{t−q} + ε_{t}

where y′_{t} is the differenced series. The “predictors” on the right-hand side are a combination of the lagged values of y_{t} and lagged errors. This model is called an ARIMA(p, d, q) model. The parameters of the model are:

- p: the order of the autoregressive component
- d: the degree of first differencing involved
- q: the order of the moving average part
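The d parameter corresponds to differencing: replacing each value with its change from the previous step, applied d times, which removes trends so the series becomes stationary before the AR and MA parts are fitted. A minimal sketch (the helper name is illustrative):

```python
def difference(series, d=1):
    """Apply first differencing d times: y'_t = y_t - y_{t-1}."""
    for _ in range(d):
        series = [series[t] - series[t - 1] for t in range(1, len(series))]
    return series

# A series with a linear trend differences to a constant series (d = 1);
# a quadratic trend needs two rounds of differencing (d = 2)
linear = difference([10, 12, 14, 16, 18], d=1)
quadratic = difference([1, 4, 9, 16, 25], d=2)
```

Each round of differencing shortens the series by one observation, which is why d is usually kept small.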

The SARIMA model (Seasonal ARIMA) is an extension of the ARIMA model. It extends ARIMA by adding a linear combination of seasonal past values and forecast errors.

## Neural Networks

Neural networks are also gaining traction for tasks such as classification and prediction. Recent studies have shown that a neural network can approximate any continuous function sufficiently well for time-series forecasting. While classical methods like ARMA and ARIMA assume a linear relationship between inputs and outputs, neural networks are not bound by this constraint: they can approximate nonlinear functions without prior knowledge of the properties of the data series.

Neural networks such as multilayer perceptrons (MLPs) offer multiple advantages that make them worth considering:

- **Robust to noise**: Neural networks are not only robust to noise when it comes to input data but also robust in the mapping function. This can come in handy when working with data that contains missing values.
- **Nonlinear support**: Neural networks are not bound to strong assumptions and a rigid mapping function. They are able to learn from new linear and nonlinear relationships continuously.
- **Multivariate inputs**: Multivariate forecasting is supported because the number of input features is completely variable.
- **Multi-step forecasts**: The number of output values is variable as well.
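Before any of these advantages apply, the time series has to be reframed as a supervised learning problem: sliding windows of past values become the network’s inputs, and the following values become its targets. A minimal sketch of that windowing step (the function name is illustrative; the resulting pairs would then be fed to an MLP trained with a library of your choice):

```python
def make_supervised(series, n_in, n_out):
    """Turn a series into (input window, output window) pairs for a neural net.

    n_in past values become the input features (multivariate inputs),
    n_out future values become the targets (multi-step forecasts).
    """
    X, Y = [], []
    for t in range(len(series) - n_in - n_out + 1):
        X.append(series[t:t + n_in])
        Y.append(series[t + n_in:t + n_in + n_out])
    return X, Y

series = [1, 2, 3, 4, 5, 6, 7]
X, Y = make_supervised(series, n_in=3, n_out=2)
# First pair: inputs [1, 2, 3] predict targets [4, 5]
```

Choosing n_in and n_out directly controls the multivariate-input and multi-step-forecast properties listed above.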

## TBATS

A lot of time series contain complex and multiple seasonal patterns (e.g., hourly data containing a daily pattern, weekly pattern, and annual pattern). The most popular models (e.g., ARIMA and exponential smoothing) can only account for one seasonality.

A TBATS model can deal with complex seasonalities (e.g., non-integer seasonality, non-nested seasonality, and large-period seasonality) with no seasonality constraints, making it possible to create detailed, long-term forecasts. But there is also a drawback to using TBATS models. They can be slow when calculating predictions, especially with long time series.

TBATS is an acronym for some of the most important features that the model offers:

- **T**: Trigonometric seasonality
- **B**: Box-Cox transformation
- **A**: ARIMA errors
- **T**: Trend
- **S**: Seasonal components

## Conclusion

Time-series forecasting is a powerful method for predicting future trends and values in time-series data. Time-series forecasting may hold tremendous value for your business development if you have access to historical information with a time component. While there is a myriad of forecasting methods to choose from, most of them are focused on specific situations and types of data, which makes it relatively easy to choose the right one.

If you are interested in time-series forecasting, take a look at this tutorial about analyzing cryptocurrency market data. By using a time-series database like TimescaleDB, you can ditch complex analysis techniques that require a lot of custom code and instead use SQL to generate insights.