ALTER HyperTable and Continuous Aggregates

Hi community,

I am new to TimescaleDB; sorry if this is something simple, but I can’t find the solution in existing threads or on Google.

Let’s say I have a table with columns:

  • Time
  • Key1
  • Value

I configure it as a hypertable and set the retention policy to 7 days.
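Roughly, the setup looks like this (table and column names are simplified for the example):

```sql
-- Hypothetical schema matching the description above
CREATE TABLE metrics (
    time  TIMESTAMPTZ NOT NULL,
    key1  TEXT        NOT NULL,
    value DOUBLE PRECISION
);

-- Turn the table into a hypertable partitioned on "time"
SELECT create_hypertable('metrics', 'time');

-- Drop raw chunks older than 7 days
SELECT add_retention_policy('metrics', INTERVAL '7 days');
```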

Next I create a continuous aggregate with a 1-hour time bucket, grouping on Key1 and taking the AVG of Value.
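The continuous aggregate is defined along these lines (again simplified; the refresh intervals are illustrative):

```sql
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket(INTERVAL '1 hour', time) AS bucket,
    key1,
    AVG(value) AS avg_value
FROM metrics
GROUP BY bucket, key1;

-- Refresh policy so the aggregate keeps up with new data
SELECT add_continuous_aggregate_policy('metrics_hourly',
    start_offset      => INTERVAL '3 hours',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');
```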

Now I want to add a Key2 column to the original table, and update the continuous aggregate to group on Key1 and Key2.

I have seen that I can ALTER the original table to add the new column with a default value successfully.
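For example (the column name and default value are just placeholders):

```sql
-- Existing rows are backfilled with the default
ALTER TABLE metrics ADD COLUMN key2 TEXT NOT NULL DEFAULT 'unknown';
```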

My question is: how can I update the materialized view / continuous aggregate to add the column without losing the data (because I only keep 7 days in the original table)?

Thanks


Hello @Idubrois :wave:

Unfortunately, you can’t alter a continuous aggregate; you’d have to build it again. To keep the data, you could create a new one with the new column and retain the old one.

There are a couple of links here that explain this subject:

As you’re new to TimescaleDB, I think you might find some of the YouTube content useful, especially these editions:

If anyone has some better suggestions for you here, hopefully they’ll chime in.

Hi Lorraine,

Thanks for your answer.

Does it mean:

  1. I should stop the continuous aggregate (but keep the old data)
  2. ALTER my source hypertable
  3. Create a new continuous aggregate using the newly added column
  4. Unify the previous continuous aggregate with the newly created one, using a view or something

In this case, is it a best practice to always version the continuous aggregates in case of a change, and always create a view to ‘hide’ the possibly multiple aggregates behind it?

Like this?

  • SourceTable (hypertable)
  • SourceTable_Daily_v1 (continuous aggregate)
  • SourceTable_Daily (View over SourceTable_Daily_v1)

And if the schema changes I will have the following (see the SQL sketch after this list):

  • SourceTable (hypertable)
  • SourceTable_Daily_v1 (continuous aggregate; stopped)
  • SourceTable_Daily_v2 (continuous aggregate; in progress)
  • SourceTable_Daily (View over SourceTable_Daily_v1 UNION SourceTable_Daily_v2)
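In SQL, I imagine the unifying view would look roughly like this (the column names, the cutover date, and the default for the missing column are placeholders):

```sql
CREATE VIEW "SourceTable_Daily" AS
SELECT bucket, key1, 'unknown' AS key2, avg_value
FROM "SourceTable_Daily_v1"
WHERE bucket < '2023-01-01'            -- hypothetical cutover date
UNION ALL
SELECT bucket, key1, key2, avg_value
FROM "SourceTable_Daily_v2"
WHERE bucket >= '2023-01-01';
```

That way, consumers only ever query SourceTable_Daily and the versioning stays hidden behind it.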

Thanks

Regards,
Luc

  1. That process, I think, would work, though it would probably be a little awkward: you’d have to be sure you don’t have excess data in the first aggregate, and also that any invalidations are picked up correctly or handled the way you want.

  2. In terms of what you want to be able to do, it sounds like you’d like to be able to add a column to a continuous aggregate, with a default value for the old rows that don’t have it. Is that correct? We are considering ways of making continuous aggregates simpler, and understanding better what you’d like would help us as we think about further options. Ideally, that would make your question about best practices obsolete in the future, but we do have a bit of implementation work to do on that, so I can’t say when all this will happen. We’d love your input.


Hi

Sorry for my late answer, I’ve been dealing with some work issues (not related to Timescale, don’t worry :slight_smile: ).

  1. It’s mostly performance counters; data should only be inserted within the last hour; but indeed, stopping the aggregate and creating a new one may require some attention.

  2. Exactly: new column(s) with a default value for previous entries. The current use case I have is the following:
    Until now, I was measuring the performance of methods in the code, referencing each with 4 columns:

  • Assembly,
  • Namespace,
  • Class,
  • Method.

Now I also monitor the methods of API controllers, so I would like to add some new columns: hostname, HTTP return code, …
So for the previous methods’ performance counters it does not matter, but from now on, some counters will have additional values.

Thanks


I’m in a similar situation. When building our hypertables, we opted to go with a wide table schema to hold a number of stats, and built continuous aggregates to consolidate historical data to lose granularity over time. We didn’t realize at the time that it would not be possible to add columns for other stats later without losing the historical data in the materialized views. Perhaps a narrow table strategy would have been wiser. Looking forward to any possible solutions the tsdb team can come up with. Happy to provide input on use cases if it’s useful.


If I have multiple stocks with identical timestamps but different IDs, I need the primary key to be a composite key (datetime plus asset ID). Can this be done?

Hi @kerrycfendley

It can be done, but on the whole, we don’t recommend this approach if you are using a single node. You can read more on the recommended practices here https://docs.timescale.com/timescaledb/latest/how-to-guides/hypertables/about-hypertables/#best-practices-for-time-partitioning/

For this kind of requirement on a single node we’d suggest multiple tables. If you need more help, let me know and I’ll see if I can get one of the team to check in on your use case. Hope this helps.
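For reference, such a composite key might look like this (table and column names are hypothetical); note that unique constraints on a hypertable must include the time partitioning column, which this key does:

```sql
-- Hypothetical table with a composite primary key: time + asset
CREATE TABLE stock_prices (
    datetime TIMESTAMPTZ NOT NULL,
    asset_id INTEGER     NOT NULL,
    price    NUMERIC,
    PRIMARY KEY (datetime, asset_id)
);

SELECT create_hypertable('stock_prices', 'datetime');
```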

Just to chime in that this is something that we would also appreciate a lot:

We have a large table (well, several actually) where we want to add a new column with a default value of 0, and we need to migrate the materialized view (created as a continuous aggregate) as well in order to preserve access to historical data. Setting a single default value for the new column for all current data in the materialized view would be perfectly acceptable.

Are there any plans for such functionality, such that it may be worth waiting for it, or should we try the work-around?

Glad I read this article, because it got me thinking: as a startup, knowing that we can save time by not building our own materialized view mechanism (non-trivial), but at the cost of having to define that view in terms of an immutable schema, is quite scary.

But what really got me thinking was our intention to apply a data retention policy. If we do that, then no matter how we build the view, we’d have a real process to go through to rebuild it: taking our raw data, rehydrating it (in a production system), and rebuilding the view.

I think I’m going to focus on the tiered storage approach initially, at least while our solution domain stabilizes. The thought that, if we have to recreate a materialized view, we can bypass having to re-load our data seems pretty compelling.