Most Recent Average

khash · October 5, 2022, 4:48pm

Say I am storing CPU readings in a table, with a simple server_id, usage_idle and created_at (time).
I also have an index on server_id, created_at

Running the following query will give me the average of the CPU value for 5 minute time buckets:

SELECT server_id, time_bucket('5 minute', created_at) AS bucket, avg(usage_idle) AS avg_idle
FROM cpu
GROUP BY server_id, bucket
ORDER BY bucket DESC

How can I return the value of the last time bucket for each server? Having that, I can then create a continuous aggregate to hold and update those values and run regular queries against them for alerting and other purposes.

Note: I’m not married to the idea of using time buckets. Having an efficient way to get the latest rolling average of usage_idle across thousands of server is also ok, as long as I can define the rolling average by time period, and not the number of readings (the readings are on a regular frequency, but not guaranteed)

James · October 11, 2022, 11:16am

Hi Khash - sorry about the delay!

Correct me if I’m wrong, but are you just looking for {server_id, avg(usage_idle)} for the previous 5 mins?

If this is what you want you could do the following:

SELECT server_id, avg(usage_idle) AS avg_idle
FROM cpu
WHERE created_at > now() - interval '5 mins'
GROUP BY server_id;

However this is constantly moving, so you’ll get a different result each time you run it. What I think you might be asking for is to get the results for the last complete 5 min time bucket.

So if the current time is 5:32 then you would want the bucket from 5:25-5:30 (as the 5:30 - 5:35 bucket is still collecting data). If that’s the case then you could use something like:

SELECT 
     server_id,
     time_bucket('5 minute', created_at) AS bucket,
     avg(usage_idle) AS avg_idle
FROM cpu
WHERE 
    -- get the last complete time bucket only
    time_bucket('5 minute', created_at) = time_bucket('5 minute', now()-interval '5 min')  AND
   -- don't scan all the partitions and data to find it
    created_at > now() - interval '10 min'
GROUP BY server_id, bucket
ORDER BY server_id;

khash · October 11, 2022, 11:20am

Thank you James. A couple of questions:

Does this mean I don’t need continuous aggregates and can just use the time_bucket function in the WHERE clause of a query on the table?
How’s the efficiency of this query (I can see that you added the backstop to avoid scanning all partitions), since we’re going to need to run this for a lot of servers, many many times a minute. What are the pointers to make it efficient for this type of query.

In other words, is it better (from a performance point of view), to create a continuous aggregate on the main table with a 5 minute bucket and fetch the last row or run something like your second suggested query every time, when the requirement is to get the average value of the metric for the most recent 5 minute window?

James · October 14, 2022, 3:31am

Hi again,

The answer is “it depends”. If you’re running the same query many-many times a minute for many server values caggs are a great way to go. Having said that, the query above will perform because it’s only looking at the most recent partition. I’d imagine this index would be permanently be in memory (because you’re accessing it so much).

I would start by evaluating the cost and runtime of a query like the one above on your data, and then make a call about it based on that information. A lot of it will come down to the cost of the avg function (ie: how many data points are you operating on within that 5 mins). I’m guessing it’s not many (maybe one per 10 seconds or something?).

khash · October 14, 2022, 9:47am

Thanks James.

At this point, our main goal is to assess the best DB for our requirements. So far we really like TimescaleDB for the PG compatibility and the ease of use, however the support is becoming a point of concern.

In the example above, almost any query (not going through and caggs) with all the right indices in place (according to the execution plan) takes between 2 to 4 seconds. Adding or removing the 10 min condition, to avoid scanning of older chunks doesn’t seem to help either. I’m sure this is a configuration issue on our side, but we can’t get much help from your support or sales teams. Had a sales call with a promise of a follow up from your support team which never happened even after us reaching out again a couple of time.

Frankly this has been a concern for us. We started hopeful for using TS Cloud, but given the level of support for a trial user with a pre-sales call, it’d be difficult for us to justify TSDB internally.

James · October 16, 2022, 11:16pm

This doesn’t sound like a great experience! Let me ping the support team and see what the status is. Could you DM me your company name or the email you signed up under?

Let’s get this sorted out.