Continuous aggregate policy not run every hour

ricardosferreira · April 29, 2022, 2:24pm

Hi,

I’m creating a police to refresh a continuous aggregate every hour, however there is hours when the aggregations doesn’t not run.

SELECT add_continuous_aggregate_policy('counters_agg_hoour',
    start_offset => INTERVAL '3 hour',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');

When the jobs runs it schedule the next run to the end_of_last_run + schedule_interval. so if the jobs takes long time to run there will hours when it doesn’t run.

I could set schedule_interval to 30 minute (for instance) but it will run many times.
There are any way to run repeatedly at same time? I really don’t understand why schedule works like this

Thanks.

LorraineP · April 29, 2022, 4:12pm

Hello @ricardosferreira

Welcome to the forum and thanks for the question. You may get a specific answer to this (and I am sorry that I don’t have the direct knowledge to provide that) but pending that, I wanted to see if this video might help? #TimescaleTuesday: Common questions with Continuous Aggregates - YouTube

I tried to set it at a good starting place for explaining how the scheduling works… if it doesn’t give you what you need though please let us know… and in any case you might get another answer

ryanbooz · May 2, 2022, 3:06pm

@ricardosferreira,

At the moment, the jobs engine only schedules jobs at the completion of a job. There have been discussions and feature requests (#2966 & #2588 are examples) around changing that behavior and I know there is some strong support to see those changes made in the future.

That said, if your continuous aggregate query takes longer to run than the hour slot allotted, then it would seem that there’s an opportunity to improve that query. Have you examined the execution plan for the query outside of the CAGG? Often, when this happens, it points to server settings or query inefficiencies.

How long would you expect that query to run normally for 2-hour window of time? And, in fact, given your example setting above, each time it runs we would presume that it’s mostly just compiling data for the last hour, not the full two hours unless you have a lot of data being modified from now()-'3 hours'::interval' to now()-'2 hours'::interval.