Continuous aggregates - apply changes from standard table

Hi, I’m storing IoT sensor measurements in a hypertable and information about sensor health in a standard Postgres table.
I’m trying to build a continuous aggregate (CA) query that calculates hourly averages for every device, like this:

measurements

time                 installation_id  value
2023-09-06 05:10:00  1                 6000.0
2023-09-06 05:20:00  1                 4000.0
2023-09-06 05:10:00  2                 8000.0
2023-09-06 05:20:00  2                10000.0
2023-09-06 06:30:00  3                10000.0
2023-09-06 06:40:00  3                20000.0

sensor_health

installation_id  since                till
1                2023-09-06 05:00:00  2023-09-06 06:10:00
3                2023-09-06 06:00:00  2023-09-06 07:00:00

CREATE MATERIALIZED VIEW measurements_1h
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', m.time) AS hour,
       m.installation_id,
       avg(value),
       count(value)
FROM measurements m
INNER JOIN sensor_health s ON s.installation_id = m.installation_id
WHERE m.time BETWEEN s.since AND s.till
GROUP BY hour, m.installation_id;

Result table

hour                               installation_id  avg    count
2023-09-06 05:00:00.000000 +00:00  1                 5000  2
2023-09-06 06:00:00.000000 +00:00  3                15000  2

Problem:
Changes on the measurements hypertable are applied to the created view - adding new measurements changes the avg value and count -
but inserting new data into the standard Postgres table doesn’t have any effect.

After executing
insert into sensor_health values (2, '2023-09-06 05:00:00', '2023-09-06 06:00:00');
and refreshing the view, still only 2 records are present.
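For completeness, the refresh here is the usual manual call, something along the lines of:

-- refresh the CA; NULL window bounds mean the entire time range
CALL refresh_continuous_aggregate('measurements_1h', NULL, NULL);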
The only way I found to trigger bucket recalculation is adding or modifying records in the hypertable.
Is there a more convenient way to trigger bucket recalculation? Adding a column like "last_update" just to trigger this doesn’t seem convenient.

Hi @Jakub, that is a very interesting scenario, because a continuous aggregate is generally based on a hypertable. You should use the same concept to make it work. Since sensor_health is also time-series data, is there any reason to keep it as a standard table?

sensor_health is not a hypertable because continuous aggregates don’t support joining two hypertables.
According to the docs, joins are allowed only between one hypertable and one standard table. I verified it: indeed, it’s not possible to create a CA with a join on 2 hypertables.

The second reason to use a standard table is that in the original problem we are working with a sensor_suspends table, because only a few sensors have suspend periods and it’s just more intuitive and easier to manage. The problem is that with the CA limitations, especially allowing only inner joins and not supporting subqueries, I wasn’t able to properly create a CA with averages over full hours (an inner join on installation_id would drop samples whose installation_id doesn’t appear in the suspends table). That’s why I created a standard materialized view, sensor_health, which is basically an inverted version of sensor_suspends, and started working with it. At the beginning I called and described sensor_health as a regular table rather than a view just to keep the question simpler, and I believe it doesn’t change the problem.
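Roughly, the inversion looks like the sketch below. This is a simplified illustration rather than the exact view definition; it assumes a sensors(installation_id) table and non-overlapping intervals in sensor_suspends(installation_id, since, till):

-- Simplified sketch: turn suspend intervals into "healthy" intervals per sensor.
-- Table and column names are illustrative, not the actual schema.
CREATE MATERIALIZED VIEW sensor_health AS
WITH gaps AS (
    -- interval before the first suspend and between consecutive suspends
    SELECT s.installation_id,
           coalesce(lag(sp.till) OVER w, '-infinity'::timestamptz) AS since,
           coalesce(sp.since, 'infinity'::timestamptz)             AS till
    FROM sensors s
    LEFT JOIN sensor_suspends sp USING (installation_id)
    WINDOW w AS (PARTITION BY s.installation_id ORDER BY sp.since)
)
SELECT installation_id, since, till FROM gaps WHERE since < till
UNION ALL
-- interval after the last suspend of each sensor
SELECT installation_id, max(till), 'infinity'::timestamptz
FROM sensor_suspends
GROUP BY installation_id;

A sensor with no suspend periods then gets a single (-infinity, infinity) health interval, so the inner join in the CA no longer drops its samples.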

You cannot create a CA with a join on 2 hypertables, but you can upvote the feature request here :nerd_face:

An alternative is to think about the continuous aggregate as an index on your data. You can then build a regular view on top of the CA that contains the joins.
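For example, something along these lines (a rough sketch reusing the names from your post; the avg_value/sample_count aliases and the measurements_1h_healthy view name are just illustrative):

-- Continuous aggregate without the join: only the hypertable is aggregated.
-- (This replaces the join-based definition, which would need to be dropped first.)
CREATE MATERIALIZED VIEW measurements_1h
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS hour,
       installation_id,
       avg(value)   AS avg_value,
       count(value) AS sample_count
FROM measurements
GROUP BY hour, installation_id;

-- Regular view on top of the CA: the join runs at query time, so changes
-- to sensor_health are always visible without invalidating the aggregate.
CREATE VIEW measurements_1h_healthy AS
SELECT m.hour, m.installation_id, m.avg_value, m.sample_count
FROM measurements_1h m
JOIN sensor_health s
  ON s.installation_id = m.installation_id
 AND m.hour >= s.since AND m.hour < s.till;

Note that the filtering here happens at the bucket level rather than per raw sample, so the semantics differ slightly from the join inside the CA; whether that is acceptable depends on how the health windows align with the hourly buckets.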


Hey @Jakub,

You are right: changes to the regular table that participates in the join of a continuous aggregate are not propagated to the continuous aggregate.

You can check this limitation and others in the relevant section of the documentation that I am linking.

The reason we made that choice is the way joins can sometimes be set up: any change to the regular table can end up affecting (invalidating, as we call it internally) large time ranges, or even the whole time range the continuous aggregate is built on top of.

Think about updating a single sensor_health record to {id, 2013, 2024}, either accidentally or on purpose. That would result in the continuous aggregate having to be rematerialized from scratch, with a potentially very expensive query running over all the chunks of the hypertable, which can be hundreds of gigabytes or even terabytes in size in some cases.

Due to that performance limitation, we made the decision to focus on regular tables that act more like static dimensions: metadata tables that do not change. Those tables cover a sub-case that many Timescale users have told us they want, joining in order to further filter or group by some of their attribute values (e.g. a device type or manufacturer, a specific region, etc.).
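As a quick illustration of that supported pattern (the devices table and its columns are hypothetical here):

-- Hypothetical static metadata table: one row per device, rarely changing.
CREATE TABLE devices (
    installation_id integer PRIMARY KEY,
    device_type     text,
    region          text
);

-- Continuous aggregate joined against the metadata table and grouped by an attribute.
CREATE MATERIALIZED VIEW measurements_by_type_1h
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', m.time) AS hour,
       d.device_type,
       avg(m.value) AS avg_value
FROM measurements m
JOIN devices d ON d.installation_id = m.installation_id
GROUP BY hour, d.device_type;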

I understand that this is a limitation that blocks use cases like yours at the moment, but this first version of supporting joins in continuous aggregates allows us to iterate on this difficult problem, while efficiently solving the most requested use case and gathering feedback like yours. We are planning to revisit this at some point in the future, but it is not currently on our roadmap for the next quarter.