materialized_only=true vs materialized_only=false: query speed drops too much

Hi. I have a continuous aggregate. When I set materialized_only=true, query speed is good, but with materialized_only=false it is far too slow. I have a refresh policy that runs every five minutes and refreshes the continuous aggregate, and within those five minutes there is very little data, maybe none at all. Why is the query so much slower? Since the refresh policy runs every five minutes, I would expect everything older than five minutes to be read from the materialized view itself, so the query should not be that much slower.

I just noticed it is only slow when ORDER BY time is used; without ORDER BY time it is okay. So, in summary: it is slow when materialized_only=false and ORDER BY time is used. With materialized_only=true, the ORDER BY time does not matter.
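For reference, this is roughly how I compare the two cases with and without the ORDER BY (the 1-hour aggregate used here is the one from my setup shown further down; adjust names as needed):

-- slow case: real-time aggregation plus ORDER BY on the time column
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM history_data_metal_1h
WHERE time_1h > '2021-01-01' AND time_1h < '2021-02-01' AND slug = 'XAU_USD'
ORDER BY time_1h;

-- same query without ORDER BY for comparison
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM history_data_metal_1h
WHERE time_1h > '2021-01-01' AND time_1h < '2021-02-01' AND slug = 'XAU_USD';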

Hi, it will slow down with real-time aggregation because it joins in data from your raw table. If that table is big, it will be more expensive to fetch the records.

An ORDER BY will let it use indexes and will speed it up.
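Conceptually (this is only a sketch with placeholder names, not the actual internal query), a real-time continuous aggregate behaves roughly like this:

-- materialized part: rows already computed by the refresh policy
SELECT time_1h, slug, high, open, close, low
FROM materialized_part
UNION ALL
-- real-time part: aggregate the raw rows newer than the materialization watermark
SELECT time_bucket('1 hour', time) AS time_1h, slug,
       max(high), first(open, time), last(close, time), min(low)
FROM raw_hypertable
WHERE time >= watermark
GROUP BY 1, 2;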

Hi, thank you for the answer, but that is not the case: the real-time data is empty for now.
Here is what I have done:
I created a 1-second candle hypertable and, based on that, created continuous aggregates for 1 minute, 15 minutes, 1 hour, 4 hours, and 1 day (each based on the previous one). Then I fetched XAU_USD 1-minute data and stored it in the 1-second candle table, called refresh to fill the upper materialized views, set refresh policies for them, and deleted the data from the hypertable itself. For now there are no real-time updates, but my query is much slower with materialized_only=false than with materialized_only=true, and I can't figure out why. With EXPLAIN I found that the planning time is far too high for the 1-day materialized view. To solve this I created an index and enabled compression, but there was no improvement.
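For completeness, this is roughly what I mean by "call refresh and delete the data from the hypertable" (the ranges here are just examples):

-- backfill each level manually; NULL, NULL refreshes the whole range
CALL refresh_continuous_aggregate('history_data_metal_1m', NULL, NULL);
CALL refresh_continuous_aggregate('history_data_metal_15m', NULL, NULL);
-- then drop the raw 1-second chunks that are no longer needed
SELECT drop_chunks('history_data_metal_1s', older_than => INTERVAL '1 day');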

Use EXPLAIN ANALYZE on both queries and you'll see what I mentioned previously. Real-time aggregation will always need to join the raw table where the data comes from.

I'm at a conference, sorry for not going into more detail.

Thank you for the response, but as I mentioned earlier, I did use EXPLAIN ANALYZE and the planning time is about 2 seconds, which is too much in my opinion. Is my approach correct at all (building continuous aggregates for 1m, 15m, 1 hour, 4 hours, and 1 day, each based on the previous one)?
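In case it matters, this is how I check how many chunks the planner has to deal with (assuming the standard timescaledb_information views; the second query is only to find the hypertable that backs each aggregate):

-- chunk count of the raw hypertable
SELECT count(*) FROM timescaledb_information.chunks
WHERE hypertable_name = 'history_data_metal_1s';

-- materialization hypertable behind each continuous aggregate
SELECT view_name, materialization_hypertable_name
FROM timescaledb_information.continuous_aggregates;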

We were discussing a similar issue with another user about using too many levels of granularity.

Maybe this ends up creating heavy overhead because it has too many levels.

@fabriziomello any thoughts about it?

I'm really stuck on this. I fetched historical XAU_USD data as one-minute candles, stored them in a hypertable, and then built the continuous aggregates. But now, if I run this query:

SELECT
  time_bucket('1 hour', "time") AS t1hour,
  slug,
  max(high) AS high,
  first(open, time) AS open,
  last(close, time) AS close,
  min(low) AS low
FROM history_data_metal_1s srt
WHERE time > '2021-01-01' AND time < '2021-02-01' AND slug = 'XAU_USD'
GROUP BY t1hour, slug
ORDER BY t1hour DESC, slug;

the result comes back in 58.958 ms.

But if I read from the continuous aggregate with this query:
select * from history_data_metal_1h where time_1h > '2021-01-01' and time_1h < '2021-02-01' and slug = 'XAU_USD' order by time_1h asc;

it takes 30 seconds… that seems really wrong. But why? Where is my mistake?

These are the steps I took (I created a lot of extra stuff on each table after I saw the results were wrong):

CREATE TABLE history_data_metal_1s (
  time TIMESTAMPTZ NOT NULL,
  slug TEXT NOT NULL,
  open DOUBLE PRECISION NULL,
  close DOUBLE PRECISION NULL,
  high DOUBLE PRECISION NULL,
  low DOUBLE PRECISION NULL,
  volume INT NULL
);

SELECT create_hypertable('history_data_metal_1s', 'time');

CREATE INDEX history_data_metal_1s_idx_symbol_time ON history_data_metal_1s (slug, time ASC);

CREATE UNIQUE INDEX history_data_metal_1s_idx_symbol_time_unique ON history_data_metal_1s (slug, time);

CREATE MATERIALIZED VIEW history_data_metal_1m
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 minute', "time") AS time_1m,
  slug,
  max(high) AS high,
  first(open, time) AS open,
  last(close, time) AS close,
  min(low) AS low
FROM history_data_metal_1s srt
GROUP BY time_1m, slug;

SELECT add_continuous_aggregate_policy('history_data_metal_1m',
  start_offset => INTERVAL '4 minute',
  end_offset => INTERVAL '0 minute',
  schedule_interval => INTERVAL '2 minute');

CREATE MATERIALIZED VIEW history_data_metal_15m
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('15 minute', "time_1m") AS time_15m,
  slug,
  max(high) AS high,
  first(open, time_1m) AS open,
  last(close, time_1m) AS close,
  min(low) AS low
FROM history_data_metal_1m srt
GROUP BY time_15m, slug;

CREATE MATERIALIZED VIEW history_data_metal_1h
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', "time_15m") AS time_1h,
  slug,
  max(high) AS high,
  first(open, time_15m) AS open,
  last(close, time_15m) AS close,
  min(low) AS low
FROM history_data_metal_15m srt
GROUP BY time_1h, slug;

create index history_data_metal_1h_time_idx on history_data_metal_1h (slug,time_1h desc);
create index history_data_metal_1h_time_idx_2 on history_data_metal_1h (slug,time_1h desc);
create index history_data_metal_1h_time_idx_3 on history_data_metal_1h (time_1h desc);
create index history_data_metal_1h_time_idx_4 on history_data_metal_1h (time_1h);
create index history_data_metal_1h_time_idx_5 on history_data_metal_1h (time_1h,slug);
create index history_data_metal_1h_time_idx_6 on history_data_metal_1h (time_1h desc,slug);
create index history_data_metal_1h_time_idx_7 on history_data_metal_1h (slug,time_1h);

SELECT add_continuous_aggregate_policy('history_data_metal_1h',
  start_offset => INTERVAL '2 h',
  end_offset => INTERVAL '0 minute',
  schedule_interval => INTERVAL '20 minute');

Any help please :cry: I'm going crazy :sob:

If I don't build each continuous aggregate on the previous one but instead build them all directly on the hypertable, the problem goes away. But I would much prefer to build the continuous aggregates on top of each other; it would help me a lot. For example, if you want a continuous aggregate of 1-month candles built on the real-time data, you have to keep the real-time data for a whole month. That does not seem good at all, because I don't want to use the real-time data and it takes too much storage. I would rather use a retention policy and remove the real-time data every 24 hours, and have the one-month data based on the 1-week or 1-day data. With that approach, to build the latest monthly candle we don't need 30 days of real-time data; the last 30 candles of the 1-day table are enough (the last day in turn comes from the level below it; if our first continuous aggregate is 1-minute candles, we only need 60 seconds of real-time data to build all timeframes instead of a whole month). In summary, it makes more sense to build the 1-month candle on the 1-day candle rather than on the real-time data (especially when I already have the 1-day data), but in practice I can't get it to work: continuous aggregates based on each other perform terribly.
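What I have in mind for the raw table is something like this (the interval is just an example):

-- keep only 24 hours of raw 1-second data; the aggregates keep the history
SELECT add_retention_policy('history_data_metal_1s', INTERVAL '24 hours');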

It seems to me that building continuous aggregates on top of each other is the right way, so why does it work so badly when I try to do it that way?

Hi @sarv_sarv, I think the problem is more related to how many levels you have in the hierarchy.

If you try fewer levels, like 1 minute, 1 hour, and 1 day, you'll have much less data processing and fewer dependencies in the nested levels.

And remember that you can recreate anything that is greater than 1 minute with this granularity.
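For example, a flatter hierarchy could look roughly like this (just a sketch reusing your naming, not tested against your schema):

CREATE MATERIALIZED VIEW history_data_metal_1d
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 day', time_1m) AS time_1d,
  slug,
  max(high) AS high,
  first(open, time_1m) AS open,
  last(close, time_1m) AS close,
  min(low) AS low
FROM history_data_metal_1m
GROUP BY time_1d, slug;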

Can you elaborate on why you use end_offset => '0 minute'? Probably 1 minute would be fine, because otherwise the policy also processes the most recent data, and if you're using real-time aggregation it already does a UNION ALL with the raw table, so you don't need to reprocess it.
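Something along these lines, for example (drop the existing policy first; the offsets are only a suggestion):

SELECT remove_continuous_aggregate_policy('history_data_metal_1m');
SELECT add_continuous_aggregate_policy('history_data_metal_1m',
  start_offset => INTERVAL '4 minutes',
  end_offset => INTERVAL '1 minute',
  schedule_interval => INTERVAL '2 minutes');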

From your problem fetching data from the 1h continuous aggregate, it seems something is wrong with your query or you're missing an index there. Check how to create one here.
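One way to see which indexes the aggregate actually has is to look at its materialization hypertable (a sketch using the standard catalog views):

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = (SELECT materialization_hypertable_name
                   FROM timescaledb_information.continuous_aggregates
                   WHERE view_name = 'history_data_metal_1h');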

Also, for the 1h view and these extra analyses, you can probably disable real-time aggregation, since your queries are not analyzing real-time data; that will avoid joining the raw data.
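For example, to turn off real-time aggregation on the 1h view:

ALTER MATERIALIZED VIEW history_data_metal_1h SET (timescaledb.materialized_only = true);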