Interestingly trying this on a regular hypertable the array version is still faster. While the sub-select is able to use an index the array version chooses a parallel sequential scan, and in this case since we are using most of the data from the chunks that ends up being more effective.
Another interesting thing is I’m unable to get the query planner to use a parallel sequential scan when querying the table directly on one of the data nodes with the remote SQL generated by the access node.
Query Regular Hypertable with SELECT
with power_series_data as materialized (
select id, labels
from series_telemetry_power s
where
s.labels && (select coalesce(array_agg(l.id), array[]::int[]) from label l where l.key = 'Location' and substring(l.value from '^x[0-9]+c[0-9]+s[0-9]+b[0-9]+$') in ('x9000c1s0b0','x9000c1s0b1','x9000c1s1b0','x9000c1s1b1','x9000c1s2b0','x9000c1s2b1','x9000c1s4b0','x90
00c1s4b1'))
)
select
time_bucket(60000, m.ts) time,
avg(val) Power
from telemetry_power m
where
m.series_id IN (select id from power_series_data)
and m.ts between 1650663853916 and 1651268653916
group by time
order by 1 desc;
PLAN:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=82707.16..85312.12 rows=115776 width=16) (actual time=74947.088..100027.483 rows=5756 loops=1)
Output: (time_bucket('60000'::bigint, m_1.ts)), avg(m_1.val)
Group Key: (time_bucket('60000'::bigint, m_1.ts))
Buffers: shared hit=655475, temp read=14869 written=14925
CTE power_series_data
-> Bitmap Heap Scan on public.series_telemetry_power s (cost=4.43..7.55 rows=4 width=60) (actual time=0.409..0.885 rows=40 loops=1)
Output: s.id, s.labels
Recheck Cond: ((s.labels)::integer[] && $0)
Heap Blocks: exact=4
Buffers: shared hit=16
InitPlan 1 (returns $0)
-> Aggregate (cost=2.19..2.20 rows=1 width=32) (actual time=0.308..0.334 rows=1 loops=1)
Output: COALESCE(array_agg(l.id), '{}'::integer[])
Buffers: shared hit=3
-> Index Only Scan using label_key_value_id_key on public.label l (cost=0.27..2.18 rows=1 width=4) (actual time=0.068..0.228 rows=8 loops=1)
Output: l.key, l.value, l.id
Index Cond: (l.key = 'Location'::text)
Filter: ("substring"(l.value, '^x[0-9]+c[0-9]+s[0-9]+b[0-9]+$'::text) = ANY ('{x9000c1s0b0,x9000c1s0b1,x9000c1s1b0,x9000c1s1b1,x9000c1s2b0,x9000c1s2b1,x9000c1s4b0,x9000c1s4b1}'::text[]))
Rows Removed by Filter: 19
Heap Fetches: 0
Buffers: shared hit=3
-> Bitmap Index Scan on series_telemetry_power_labels_idx (cost=0.00..2.23 rows=4 width=0) (actual time=0.386..0.392 rows=40 loops=1)
Index Cond: ((s.labels)::integer[] && $0)
Buffers: shared hit=12
-> Sort (cost=82699.62..82989.06 rows=115776 width=16) (actual time=74945.128..87300.804 rows=1941855 loops=1)
Output: (time_bucket('60000'::bigint, m_1.ts)), m_1.val
Sort Key: (time_bucket('60000'::bigint, m_1.ts)) DESC
Sort Method: external merge Disk: 49440kB
Buffers: shared hit=655475, temp read=14869 written=14925
-> Nested Loop (cost=262.36..71801.99 rows=115776 width=16) (actual time=6.529..61589.393 rows=1941855 loops=1)
Output: time_bucket('60000'::bigint, m_1.ts), m_1.val
Buffers: shared hit=655475
-> HashAggregate (cost=0.09..0.13 rows=4 width=4) (actual time=1.884..2.239 rows=40 loops=1)
Output: power_series_data.id
Group Key: power_series_data.id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=16
-> CTE Scan on power_series_data (cost=0.00..0.08 rows=4 width=4) (actual time=0.424..1.494 rows=40 loops=1)
Output: power_series_data.id, power_series_data.labels
Buffers: shared hit=16
-> Append (cost=262.27..17575.41 rows=30269 width=20) (actual time=4.359..937.927 rows=48546 loops=40)
Buffers: shared hit=655459
-> Bitmap Heap Scan on _timescaledb_internal._hyper_7_309_chunk m_1 (cost=262.27..8646.74 rows=15381 width=20) (actual time=4.203..168.681 rows=23515 loops=40)
Output: m_1.ts, m_1.val, m_1.series_id
Recheck Cond: ((m_1.series_id = power_series_data.id) AND (m_1.ts >= '1650663853916'::bigint) AND (m_1.ts <= '1651268653916'::bigint))
Heap Blocks: exact=312723
Buffers: shared hit=316447
-> Bitmap Index Scan on _hyper_7_309_chunk_power_series_id_ts (cost=0.00..258.42 rows=15381 width=0) (actual time=3.226..3.226 rows=23515 loops=40)
Index Cond: ((m_1.series_id = power_series_data.id) AND (m_1.ts >= '1650663853916'::bigint) AND (m_1.ts <= '1651268653916'::bigint))
Buffers: shared hit=3724
-> Bitmap Heap Scan on _timescaledb_internal._hyper_7_307_chunk m_2 (cost=253.78..8777.33 rows=14888 width=20) (actual time=4.546..176.717 rows=25031 loops=40)
Output: m_2.ts, m_2.val, m_2.series_id
Recheck Cond: ((m_2.series_id = power_series_data.id) AND (m_2.ts >= '1650663853916'::bigint) AND (m_2.ts <= '1651268653916'::bigint))
Heap Blocks: exact=335060
Buffers: shared hit=339012
-> Bitmap Index Scan on _hyper_7_307_chunk_power_series_id_ts (cost=0.00..250.06 rows=14888 width=0) (actual time=3.533..3.533 rows=25031 loops=40)
Index Cond: ((m_2.series_id = power_series_data.id) AND (m_2.ts >= '1650663853916'::bigint) AND (m_2.ts <= '1651268653916'::bigint))
Buffers: shared hit=3952
Settings: effective_cache_size = '5969MB', effective_io_concurrency = '256', enable_partitionwise_aggregate = 'on', jit = 'off', max_parallel_workers_per_gather = '4', random_page_cost = '1.1', work_mem = '2547kB'
Planning:
Buffers: shared hit=63
Planning Time: 2.046 ms
Execution Time: 100074.702 ms
(63 rows)
Query Regular Hypertable with ARRAY
with power_series_data as materialized (
select id, labels
from series_telemetry_power s
where
s.labels && (select coalesce(array_agg(l.id), array[]::int[]) from label l where l.key = 'Location' and substring(l.value from '^x[0-9]+c[0-9]+s[0-9]+b[0-9]+$') in ('x9000c1s0b0','x9000c1s0b1','x9000c1s1b0','x9000c1s1b1','x9000c1s2b0','x9000c1s2b1','x9000c1s4b0','x90
00c1s4b1'))
)
select
time_bucket(60000, m.ts) time,
avg(val) Power
from telemetry_power m
where
array[m.series_id] && (select coalesce(array_agg(id), array[]::integer[]) from power_series_data)
and m.ts between 1650663853916 and 1651268653916
group by time
order by 1 desc;
PLAN:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=111578.47..125455.53 rows=85133 width=16) (actual time=9877.033..38793.776 rows=5756 loops=1)
Output: (time_bucket('60000'::bigint, m_2.ts)), avg(m_2.val)
Group Key: (time_bucket('60000'::bigint, m_2.ts))
Buffers: shared hit=64857, temp read=8413 written=8468
CTE power_series_data
-> Bitmap Heap Scan on public.series_telemetry_power s (cost=4.43..7.55 rows=4 width=60) (actual time=0.428..0.798 rows=40 loops=1)
Output: s.id, s.labels
Recheck Cond: ((s.labels)::integer[] && $0)
Heap Blocks: exact=4
Buffers: shared hit=16
InitPlan 1 (returns $0)
-> Aggregate (cost=2.19..2.20 rows=1 width=32) (actual time=0.322..0.349 rows=1 loops=1)
Output: COALESCE(array_agg(l.id), '{}'::integer[])
Buffers: shared hit=3
-> Index Only Scan using label_key_value_id_key on public.label l (cost=0.27..2.18 rows=1 width=4) (actual time=0.075..0.238 rows=8 loops=1)
Output: l.key, l.value, l.id
Index Cond: (l.key = 'Location'::text)
Filter: ("substring"(l.value, '^x[0-9]+c[0-9]+s[0-9]+b[0-9]+$'::text) = ANY ('{x9000c1s0b0,x9000c1s0b1,x9000c1s1b0,x9000c1s1b1,x9000c1s2b0,x9000c1s2b1,x9000c1s4b0,x9000c1s4b1}'::text[]))
Rows Removed by Filter: 19
Heap Fetches: 0
Buffers: shared hit=3
-> Bitmap Index Scan on series_telemetry_power_labels_idx (cost=0.00..2.23 rows=4 width=0) (actual time=0.400..0.406 rows=40 loops=1)
Index Cond: ((s.labels)::integer[] && $0)
Buffers: shared hit=12
InitPlan 3 (returns $2)
-> Aggregate (cost=0.09..0.10 rows=1 width=32) (actual time=1.813..1.853 rows=1 loops=1)
Output: COALESCE(array_agg(power_series_data.id), '{}'::integer[])
Buffers: shared hit=16
-> CTE Scan on power_series_data (cost=0.00..0.08 rows=4 width=4) (actual time=0.462..1.483 rows=40 loops=1)
Output: power_series_data.id, power_series_data.labels
Buffers: shared hit=16
-> Gather Merge (cost=111570.82..123665.81 rows=101015 width=16) (actual time=9874.894..26644.822 rows=1941855 loops=1)
Output: (time_bucket('60000'::bigint, m_2.ts)), m_2.val
Workers Planned: 4
Params Evaluated: $2
Workers Launched: 4
Buffers: shared hit=64857, temp read=8413 written=8468
-> Sort (cost=110570.76..110633.89 rows=25254 width=16) (actual time=9856.907..12265.904 rows=388371 loops=5)
Output: (time_bucket('60000'::bigint, m_2.ts)), m_2.val
Sort Key: (time_bucket('60000'::bigint, m_2.ts)) DESC
Sort Method: external merge Disk: 9960kB
Buffers: shared hit=64841, temp read=8413 written=8468
Worker 0: actual time=9852.448..12200.259 rows=386135 loops=1
Sort Method: external merge Disk: 9840kB
Buffers: shared hit=13313, temp read=1675 written=1686
Worker 1: actual time=9853.430..12294.276 rows=390005 loops=1
Sort Method: external merge Disk: 9944kB
Buffers: shared hit=12441, temp read=1688 written=1699
Worker 2: actual time=9852.388..12194.623 rows=384456 loops=1
Sort Method: external merge Disk: 9800kB
Buffers: shared hit=13293, temp read=1670 written=1681
Worker 3: actual time=9854.000..12318.312 rows=390651 loops=1
Sort Method: external merge Disk: 9960kB
Buffers: shared hit=12427, temp read=1690 written=1701
-> Parallel Append (cost=0.00..108724.16 rows=25254 width=16) (actual time=1.355..7384.293 rows=388371 loops=5)
Buffers: shared hit=64813
Worker 0: actual time=1.558..7387.841 rows=386135 loops=1
Buffers: shared hit=13306
Worker 1: actual time=1.713..7374.302 rows=390005 loops=1
Buffers: shared hit=12434
Worker 2: actual time=1.823..7386.526 rows=384456 loops=1
Buffers: shared hit=13286
Worker 3: actual time=1.651..7377.153 rows=390651 loops=1
Buffers: shared hit=12420
-> Parallel Seq Scan on _timescaledb_internal._hyper_7_307_chunk m_2 (cost=0.00..56178.52 rows=13064 width=16) (actual time=0.677..1392.274 rows=200252 loops=5)
Output: time_bucket('60000'::bigint, m_2.ts), m_2.val
Filter: ((ARRAY[m_2.series_id] && $2) AND (m_2.ts >= '1650663853916'::bigint) AND (m_2.ts <= '1651268653916'::bigint))
Rows Removed by Filter: 844862
Buffers: shared hit=33520
Worker 0: actual time=0.028..535.514 rows=73750 loops=1
Buffers: shared hit=2879
Worker 1: actual time=1.697..2680.368 rows=390005 loops=1
Buffers: shared hit=12434
Worker 2: actual time=0.034..535.094 rows=72383 loops=1
Buffers: shared hit=2857
Worker 3: actual time=1.617..2679.189 rows=390651 loops=1
Buffers: shared hit=12420
-> Parallel Seq Scan on _timescaledb_internal._hyper_7_309_chunk m_1 (cost=0.00..52419.37 rows=12190 width=16) (actual time=1.118..2177.974 rows=313532 loops=3)
Output: time_bucket('60000'::bigint, m_1.ts), m_1.val
Filter: ((ARRAY[m_1.series_id] && $2) AND (m_1.ts >= '1650663853916'::bigint) AND (m_1.ts <= '1651268653916'::bigint))
Rows Removed by Filter: 1311755
Buffers: shared hit=31293
Worker 0: actual time=1.541..2181.907 rows=312385 loops=1
Buffers: shared hit=10427
Worker 2: actual time=1.799..2182.108 rows=312073 loops=1
Buffers: shared hit=10429
Settings: effective_cache_size = '5969MB', effective_io_concurrency = '256', enable_partitionwise_aggregate = 'on', jit = 'off', max_parallel_workers_per_gather = '4', random_page_cost = '1.1', work_mem = '2547kB'
Planning:
Buffers: shared hit=63
Planning Time: 2.098 ms
Execution Time: 38831.371 ms
(91 rows)