That process, I think, would work. It would probably be a little awkward, though, and you'd have to make sure you don't have excess data in the first aggregate, and that any invalidations are picked up correctly, or handled the way you want.
In terms of what you want to be able to do, it sounds like you'd like to add a column to a continuous aggregate, with a default value for the old rows that don't have it. Is that correct? We are considering ways of making continuous aggregates simpler, and understanding better what you'd like might help us as we think about further options. Ideally, that would make your question about best practices obsolete in the future, but we have a fair bit of implementation work to do before then, so I can't say when it will happen. We'd love your input in the meantime.
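For anyone following along, the workaround discussed above, creating a replacement aggregate alongside the old one, might look roughly like this. All table, view, and column names here are hypothetical; adapt them to your schema, and note that any refresh policies on the old aggregate would need to be recreated on the new one:

```sql
-- Hypothetical schema: hypertable perf_counters(ts, method, duration_ms)
-- with an existing hourly aggregate perf_hourly.

-- 1. Create a second aggregate with the new definition, without
--    materializing anything yet.
CREATE MATERIALIZED VIEW perf_hourly_v2
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', ts) AS bucket,
       method,
       avg(duration_ms) AS avg_duration
FROM perf_counters
GROUP BY bucket, method
WITH NO DATA;

-- 2. Materialize all available history into the new aggregate.
CALL refresh_continuous_aggregate('perf_hourly_v2', NULL, NULL);

-- 3. Once the new aggregate is validated, drop the old one.
DROP MATERIALIZED VIEW perf_hourly;
```

This only works cleanly if the raw data the old aggregate was built from is still present; rows already dropped by a retention policy cannot be re-materialized.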
Sorry for my late answer, I've been dealing with some work issues (not related to Timescale, don't worry).
It's mostly performance counters; data should only be inserted in the last hour; but indeed, stopping the aggregate and creating a new one may require some attention.
Exactly: this is new column(s) with a default value for previous entries. My current use case is the following:
Until now, I was measuring the performance of methods in the code, referencing them with 4 columns:
Now I also monitor the methods of API controllers, so I would like to add some new columns: hostname, HTTP return code, …
So for previous methods' performance counters it does not matter, but from now on, some counters will have additional values.
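A minimal sketch of that use case, using made-up table and column names: the new dimensions can be added to the raw hypertable with defaults, but the continuous aggregate itself cannot be altered in place, so a replacement including them has to be created (and the old one eventually dropped):

```sql
-- Add the new dimensions to the raw hypertable; old rows
-- get the default values.
ALTER TABLE perf_counters
  ADD COLUMN hostname  text     DEFAULT 'unknown',
  ADD COLUMN http_code smallint DEFAULT NULL;

-- Replacement aggregate that groups on the new columns as well.
CREATE MATERIALIZED VIEW perf_hourly_v2
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', ts) AS bucket,
       method,
       hostname,
       http_code,
       avg(duration_ms) AS avg_duration
FROM perf_counters
GROUP BY bucket, method, hostname, http_code;
```

Historical rows simply materialize under the default values ('unknown' / NULL here), which matches "it does not matter for previous counters".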
I’m in a similar situation. When building our hypertables, we opted to go with a wide table schema to hold a number of stats, and built continuous aggregates to consolidate historical data to lose granularity over time. We didn’t realize at the time that it would not be possible to add columns for other stats later without losing the historical data in the materialized views. Perhaps a narrow table strategy would have been wiser. Looking forward to any possible solutions the tsdb team can come up with. Happy to provide input on use cases if it’s useful.
Just to chime in that this is something that we would also appreciate a lot:
We have a large table (well, several actually) where we want to add a new column with a default value of 0, and need to migrate the materialized view (created using continuous aggregate) as well in order to preserve access to historical data. Setting a single default value for the new column for all current data in the materialized view would be perfectly acceptable.
Are there any plans for such functionality, such that it may be worth waiting for it, or should we try the work-around?
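In case it helps others weighing the same trade-off, the work-around for the "default 0" case might look like the following (all names hypothetical, and it again assumes the raw data behind the old view still exists):

```sql
-- Add the new stat to the raw hypertable; all existing rows read 0.
ALTER TABLE device_stats
  ADD COLUMN new_stat double precision NOT NULL DEFAULT 0;

-- Replacement continuous aggregate including the new stat.
CREATE MATERIALIZED VIEW stats_daily_v2
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', ts) AS bucket,
       device_id,
       sum(old_stat) AS old_stat_total,
       sum(new_stat) AS new_stat_total  -- 0 for all historical buckets
FROM device_stats
GROUP BY bucket, device_id;

-- After verifying the new view, drop the old one and re-add its
-- refresh policy, e.g.:
-- SELECT add_continuous_aggregate_policy('stats_daily_v2',
--          start_offset      => INTERVAL '3 days',
--          end_offset        => INTERVAL '1 hour',
--          schedule_interval => INTERVAL '1 hour');
```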
Glad I read this thread, because it got me thinking: as a startup, it's great to know we can save time by not building our own materialized view mechanism (non-trivial), but having to define that view in terms of an immutable schema is quite scary.
But what really got me thinking was our intention to apply a data retention policy. If we do that, then no matter how we build the view, we'd have a real process to go through to rebuild it: taking our raw data, rehydrating it (in a production system), and rebuilding the view.
I think I'm going to focus on the tiered storage approach initially, at least while our solution domain stabilizes. The thought that we could bypass reloading our raw data if we ever have to recreate a materialized view seems pretty compelling.