GIN index performance

Hi,

I am using Timescale 2.9.1 on Azure’s Postgres Flex server and am seeing bad performance with a GIN index on a text array column for a seemingly simple query. Here’s the relevant table and index definitions:

                                                                Table "device_events"
                    Column                    |           Type           | Collation | Nullable |                          Default                          
----------------------------------------------+--------------------------+-----------+----------+-----------------------------------------------------------
 id                                           | text                     |           | not null | 
 time                                         | timestamp with time zone |           | not null | 
 subscriber_id                                | text                     |           | not null | 
 dev_eui                                      | text                     |           | not null | 
 m_type                                       | smallint                 |           |          | 
 mic                                          | text                     |           |          | 
 spreading_factor                             | smallint                 |           |          | 
 channel                                      | text                     |           |          | 
 gateway_node_ids                             | text[]                   |           |          | 
Indexes:
    "device_events_pkey" PRIMARY KEY, btree ("time", id)
    "device_events_subscriber_id_dev_eui_idx" btree (subscriber_id, dev_eui, "time" DESC)
    "device_events_subscriber_id_gateway_node_id_idx" gin (subscriber_id, gateway_node_ids, "time")
    "device_events_subscriber_id_idx" btree (subscriber_id, "time" DESC)
    "device_events_time_idx" btree ("time" DESC)
Foreign-key constraints:
    "device_events_device_event_type_fk" FOREIGN KEY (device_event_type_id) REFERENCES device_event_types(id)
Triggers:
    ts_insert_blocker BEFORE INSERT ON device_events FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Number of child tables: 17 (Use \d+ to list them.)

When executing this query:

SELECT id, time, gateway_node_ids
FROM device_events
WHERE time >= NOW() - INTERVAL '7 days'
  AND time <= NOW()
  AND subscriber_id = 'AAAAAAAAAAA'
  AND gateway_node_ids @> ARRAY ['BBBBBBBBBBBBB']
ORDER BY time
LIMIT 10;

The performance is not great. There are 735 rows that match the criteria, but I am limiting it to 10. Here’s the explain plan:

                                                                                              QUERY PLAN                                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4522.70..4522.73 rows=10 width=96) (actual time=23160.228..23160.238 rows=10 loops=1)
   Buffers: shared hit=2366386 read=1
   I/O Timings: shared/local read=0.009
   ->  Sort  (cost=4522.70..4522.83 rows=50 width=96) (actual time=23160.227..23160.236 rows=10 loops=1)
         Sort Key: device_events."time"
         Sort Method: top-N heapsort  Memory: 26kB
         Buffers: shared hit=2366386 read=1
         I/O Timings: shared/local read=0.009
         ->  Custom Scan (ChunkAppend) on device_events  (cost=992.01..4521.62 rows=50 width=96) (actual time=2539.064..23160.061 rows=735 loops=1)
               Chunks excluded during startup: 0
               Buffers: shared hit=2366386 read=1
               I/O Timings: shared/local read=0.009
               ->  Bitmap Heap Scan on _hyper_8_21_chunk  (cost=992.01..994.04 rows=1 width=96) (actual time=2539.063..2539.077 rows=26 loops=1)
                     Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Heap Blocks: exact=6
                     Buffers: shared hit=278543 read=1
                     I/O Timings: shared/local read=0.009
                     ->  Bitmap Index Scan on _hyper_8_21_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..992.01 rows=1 width=0) (actual time=2539.051..2539.051 rows=26 loops=1)
                           Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                           Buffers: shared hit=278537 read=1
                           I/O Timings: shared/local read=0.009
               ->  Bitmap Heap Scan on _hyper_8_22_chunk  (cost=454.01..456.04 rows=1 width=95) (actual time=3741.666..3741.707 rows=105 loops=1)
                     Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Heap Blocks: exact=19
                     Buffers: shared hit=414292
                     ->  Bitmap Index Scan on _hyper_8_22_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..454.01 rows=1 width=0) (actual time=3741.653..3741.653 rows=105 loops=1)
                           Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                           Buffers: shared hit=414273
               ->  Bitmap Heap Scan on _hyper_8_23_chunk  (cost=1096.01..1098.04 rows=1 width=95) (actual time=3795.151..3795.193 rows=105 loops=1)
                     Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Heap Blocks: exact=16
                     Buffers: shared hit=407034
                     ->  Bitmap Index Scan on _hyper_8_23_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..1096.01 rows=1 width=0) (actual time=3795.137..3795.137 rows=105 loops=1)
                           Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                           Buffers: shared hit=407018
               ->  Bitmap Heap Scan on _hyper_8_24_chunk  (cost=806.01..808.04 rows=1 width=95) (actual time=4007.366..4007.408 rows=107 loops=1)
                     Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Heap Blocks: exact=16
                     Buffers: shared hit=428017
                     ->  Bitmap Index Scan on _hyper_8_24_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..806.01 rows=1 width=0) (actual time=4007.351..4007.352 rows=107 loops=1)
                           Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                           Buffers: shared hit=428001
               ->  Bitmap Heap Scan on _hyper_8_25_chunk  (cost=728.01..730.04 rows=1 width=95) (actual time=4062.589..4062.633 rows=103 loops=1)
                     Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Heap Blocks: exact=16
                     Buffers: shared hit=426742
                     ->  Bitmap Index Scan on _hyper_8_25_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..728.01 rows=1 width=0) (actual time=4062.576..4062.576 rows=103 loops=1)
                           Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                           Buffers: shared hit=426726
               ->  Bitmap Heap Scan on _hyper_8_26_chunk  (cost=140.01..142.04 rows=1 width=95) (actual time=3737.864..3737.903 rows=105 loops=1)
                     Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Heap Blocks: exact=14
                     Buffers: shared hit=372264
                     ->  Bitmap Index Scan on _hyper_8_26_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..140.01 rows=1 width=0) (actual time=3737.849..3737.849 rows=105 loops=1)
                           Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                           Buffers: shared hit=372250
               ->  Bitmap Heap Scan on _hyper_8_27_chunk  (cost=220.01..222.04 rows=1 width=95) (actual time=1275.808..1275.856 rows=105 loops=1)
                     Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Heap Blocks: exact=23
                     Buffers: shared hit=39434
                     ->  Bitmap Index Scan on _hyper_8_27_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..220.01 rows=1 width=0) (actual time=1275.794..1275.794 rows=105 loops=1)
                           Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                           Buffers: shared hit=39411
               ->  Bitmap Heap Scan on _hyper_8_31_chunk  (cost=22.77..71.33 rows=43 width=96) (actual time=0.160..0.213 rows=79 loops=1)
                     Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Heap Blocks: exact=40
                     Buffers: shared hit=60
                     ->  Bitmap Index Scan on _hyper_8_31_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..22.76 rows=43 width=0) (actual time=0.153..0.153 rows=79 loops=1)
                           Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                           Buffers: shared hit=20
Planning:
   Buffers: shared hit=186
 Planning Time: 1.431 ms
 Execution Time: 23167.306 ms

Even getting rid of the sort does not improve things as much as I would have thought.

                                                                                          QUERY PLAN                                                                                           
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=992.01..1697.93 rows=10 width=96) (actual time=2504.352..2504.364 rows=10 loops=1)
   Buffers: shared hit=278091 read=1
   I/O Timings: shared/local read=0.009
   ->  Custom Scan (ChunkAppend) on device_events  (cost=992.01..4521.62 rows=50 width=96) (actual time=2504.351..2504.361 rows=10 loops=1)
         Chunks excluded during startup: 0
         Buffers: shared hit=278091 read=1
         I/O Timings: shared/local read=0.009
         ->  Bitmap Heap Scan on _hyper_8_21_chunk  (cost=992.01..994.04 rows=1 width=96) (actual time=2504.350..2504.355 rows=10 loops=1)
               Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
               Heap Blocks: exact=2
               Buffers: shared hit=278091 read=1
               I/O Timings: shared/local read=0.009
               ->  Bitmap Index Scan on _hyper_8_21_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..992.01 rows=1 width=0) (actual time=2504.336..2504.337 rows=25 loops=1)
                     Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
                     Buffers: shared hit=278089 read=1
                     I/O Timings: shared/local read=0.009
         ->  Bitmap Heap Scan on _hyper_8_22_chunk  (cost=454.01..456.04 rows=1 width=95) (never executed)
               Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
               ->  Bitmap Index Scan on _hyper_8_22_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..454.01 rows=1 width=0) (never executed)
                     Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
         ->  Bitmap Heap Scan on _hyper_8_23_chunk  (cost=1096.01..1098.04 rows=1 width=95) (never executed)
               Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
               ->  Bitmap Index Scan on _hyper_8_23_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..1096.01 rows=1 width=0) (never executed)
                     Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
         ->  Bitmap Heap Scan on _hyper_8_24_chunk  (cost=806.01..808.04 rows=1 width=95) (never executed)
               Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
               ->  Bitmap Index Scan on _hyper_8_24_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..806.01 rows=1 width=0) (never executed)
                     Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
         ->  Bitmap Heap Scan on _hyper_8_25_chunk  (cost=728.01..730.04 rows=1 width=95) (never executed)
               Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
               ->  Bitmap Index Scan on _hyper_8_25_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..728.01 rows=1 width=0) (never executed)
                     Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
         ->  Bitmap Heap Scan on _hyper_8_26_chunk  (cost=140.01..142.04 rows=1 width=95) (never executed)
               Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
               ->  Bitmap Index Scan on _hyper_8_26_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..140.01 rows=1 width=0) (never executed)
                     Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
         ->  Bitmap Heap Scan on _hyper_8_27_chunk  (cost=220.01..222.04 rows=1 width=95) (never executed)
               Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
               ->  Bitmap Index Scan on _hyper_8_27_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..220.01 rows=1 width=0) (never executed)
                     Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
         ->  Bitmap Heap Scan on _hyper_8_31_chunk  (cost=22.77..71.33 rows=43 width=96) (never executed)
               Recheck Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
               ->  Bitmap Index Scan on _hyper_8_31_chunk_device_events_subscriber_id_gateway_node_id_i  (cost=0.00..22.76 rows=43 width=0) (never executed)
                     Index Cond: ((subscriber_id = 'AAAAAAAAAAA'::text) AND (gateway_node_ids @> '{BBBBBBBBBBBBB}'::text[]) AND ("time" >= (now() - '7 days'::interval)) AND ("time" <= now()))
 Planning:
   Buffers: shared hit=184
 Planning Time: 1.288 ms
 Execution Time: 2504.580 ms

Any insights would be appreciated!

Figured it out. The key was to remove the time from the GIN index. GIN indexes are not sorted, so this was causing the problem.

1 Like

Very good! thanks for sharing the resolution!