Time-Series Analysis: Function Examples

Time-Series Analysis: Function Examples

As the creators of a database with unique time-series functionality, we have explored the many aspects of time-series data, explaining what it is, how you can forecast trends with it, and perform meaningful time-series data analysis.

Hopefully, your wheels are turning by now, and you’ve started to identify applications or areas in your business that have time-series data just waiting for you to do something with it. So, now what?

This is when the fun (and real work) begins. It’s also when you’ll really see why time-series databases are essential tools.

Let’s look at an example based on a fictional web application. Until now, we’ve only tracked the last time a user logged in as a field in the “users” table and always updated the previously stored value with the new login information.

While this allows us to query how many people have logged in over a week or a month, we’re unable to analyze how often they log in, for how long, or drill into any other aspects that might tell us more about our users’ experience or their usage patterns.

We can quickly improve upon this by tracking information about every login, not just the most recent one. To do this, we’ll start logging the timestamp of each login and the type of device used to access our application (e.g., phone, tablet, desktop).

This small change—tracking just one more property about the user login experience—provides immediate value, allowing us to answer questions like, “What kind of devices are most frequently used (by individual users and across all users)?” and “What time of day are users the most active?”.

From there, we can better inform the features we prioritize, such as mobile-specific capabilities, the times we display certain promotional messages, and beyond.

user_id first_name last_name
1 Mary Smith
2 Eon Tiger
3 Ajay Kulkarni

User_logins Table

user_id login_timestamp device_type
1 2020-11-08 11:15:00 mobile
2 2020-11-10 15:34:00 mobile
1 2020-11-11 12:13:00 desktop
3 2020-11-15 02:47:00 tablet
... ... ...

How Time-Series Databases Can Help Data Analysis

With the updated data model and these new user details logged, we can start to query the data for insights. Time-series databases like Timescale help with this kind of information in two crucial ways:

  1. As your application scales and data volume grows, your database is built to handle and ingest the relentless stream of data inherent to time-series workloads, mitigating any negative performance impacts or lags.
  2. They provide specialized functions that make it easier—and faster—to query aspects of your data in meaningful ways where time is a primary component.

To demonstrate some of those specialized time-series analysis capabilities, let’s look at a few example functions that Timescale adds to the SQL language and how we can use them to better analyze our users’ usage behavioral patterns. (For more examples, read about Timescale hyperfunctions.)

In each example, we’re still relying on standard SQL patterns, a language many developers are familiar with, and augmenting for time-series use cases. WHERE clauses still work, and we can still aggregate data easily with GROUP BY clauses.

But now, rather than parse out specific parts of the dates to group the data appropriately (for instance), we can use a function like time_bucket() to easily aggregate data across almost any interval.

And as a bonus, it also makes the query easier to read!

Time-Series Analysis Query #1: How Many Daily Logins for the Last Month?

SELECT
  time_bucket('1 day', login_timestamp) as one_day,
  COUNT(*) total_logins
FROM user_logins
WHERE login_timestamp > now() - INTERVAL '1 month'
GROUP BY one_day
ORDER BY one_day;

This first example is the “Hello, World!” of time-series queries, using the time_bucket() function to automatically group and aggregate our time-series data to help us get a quick view of total daily logins ( 1 day  in the function above) for the last month (WHERE login_timestamp > now() - INTERVAL ‘1 month’).

Notice that time-series queries allow you to specifically query time intervals rather than breaking down dates into each component (month, day, year, hour, etc.) to do a similar aggregation without these specialized functions.

Time-Series Analysis Query #2: What Was Each User's Last Login Time, and What Device Did They Use?

SELECT
  user_id,
  first_name || ' ' || last_name AS full_name,
  last(login_timestamp, login_timestamp) AS last_login,
  last(device_type, login_timestamp) AS last_device_type
FROM user_logins ul
INNER JOIN users u on ul.user_id = u.user_id
WHERE login_timestamp > now() - INTERVAL '1 month'
GROUP BY user_id, full_name
ORDER BY user_id;

In this more complex example, we use another specialized function, last(), to query useful information about our users, specifically the most recent value of a specific set of data.

Without a specialized function like last(), we would need to write a query with something like a LATERAL JOINor a correlated subquery. But, with our handy built-in specialized function, we can get this type of valuable information in a straightforward (and often swift) way.

Time-Series Analysis Query #3: For the Last Week, Which Six-Hour Periods Saw the Most Logins From Users on Tablet Devices?

SELECT
  time_bucket('6 hours', login_timestamp, timestamptz ‘2020-01-01 08:00:00’) as device_bucket,
  device_type,
  count(*) AS logins_by_device
FROM user_logins
WHERE login_timestamp > now() - INTERVAL '1 week' 
  AND device_type = 'tablet'
GROUP BY device_bucket, device_type
ORDER BY logins_by_device desc;

In this final example query, we demonstrate how functions like time_bucket() aren’t bound to common intervals (‘1 hour, ‘1 day’, ‘1 week’, etc.) but can be used for INTERVAL grouping. And more notably, we can combine these functions with parameters that allow us to refine our results to a specific subset.

In this case, we asked Timescale to return results in six-hour buckets, aligning the first bucket to 8 a.m. UTC and only returning logins from tablet-based sessions.

These examples just scratch the surface; you have infinite flexibility in how your data can be queried and modeled.

In summary, logging just two additional details about user logins—device type and timestamps for every login, not just the latest—quickly transforms our ability to understand how our web application is used and how time-series databases like Timescale help us analyze and make sense of data so that we can make decisions faster.

Start Analyzing Your Time-Series Data

If you need a time-series database or want to try it out, spin up a fully-managed Timescale instancefree for 30 days.

From there, follow our getting started guide to configure your database and execute your first query, then choose one of our fun tutorials to delve deeper into Timescale:

You can also read stories from people who develop real-world time-series data applications:

Have questions or want to learn more? Join our Slack Community and Forum, where Timescale engineers and community members are active in all channels.

Ingest and query in milliseconds, even at terabyte scale.
This post was a collaboration between
5 min read
time-series data
Contributors

Related posts