Query planner regression

We have a large TimescaleDB database (1 TB of data) that had to be upgraded to PostgreSQL 14 / Timescale 2.11.1, and we are experiencing a regression in query planning time.

The old and new databases contain the same dataset, have the same schema and indexes, and use the same compute resources, but planning time is sometimes 10x slower on the new machine. Specs below.

The chunk interval is longer on the new db, yet it has more chunks for the same dataset, which is strange. Compression is enabled on both databases.

New db (slow)
PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), 64-bit
Timescale: 2.11.0
Chunk interval 3 days
SELECT count(*) from show_chunks('raw.numericsamples');
Chunk count: 2650

Old db
PostgreSQL 12.4 (Ubuntu 12.4-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
Timescale: 1.7.4
Chunk interval 1 day
SELECT count(*) from show_chunks('raw.numericsamples');
Chunk count: 1267
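
For reference, the chunk intervals above can be confirmed on the 2.x side from the timescaledb_information.dimensions view (the 1.7 catalog views are named differently):

SELECT hypertable_schema, hypertable_name, time_interval
FROM timescaledb_information.dimensions
WHERE hypertable_name = 'numericsamples';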

**Here are the explain plans for a pretty basic query.**
Old db (fast)
HGRJB : old db | explain.depesz.com
New db (slow)
c8RJ | explain.depesz.com

Can someone please help us diagnose why the planner is so slow for any query that touches the time-series data?

It sounds like a cache warm-up problem. The TimescaleDB extension caches chunk metadata in memory, so after a few queries all of the metadata should be loaded. Your old db is probably faster simply because its production workload keeps running queries, so that metadata is already in memory, saving some IO.

Also, with chunks three times bigger, fewer chunks fit in memory, so you pay more IO swapping chunks in and out. At least that would be my first guess.

One way to test this hypothesis is to warm up the cache by running some production queries. You can pull them from pg_stat_statements if you have it enabled.
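
If pg_stat_statements is loaded, something like this pulls the most frequent queries to replay (mean_exec_time is the column name on PostgreSQL 13+; on 12 it is mean_time):

SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 20;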

Probably not caching, as the new deployment is in active use as well. The query used in the explain examples was extracted from the query log and run multiple times manually before doing the explain.

Remade the hypertable with larger 7-day chunks and the speed is orders of magnitude faster.
tuAY | explain.depesz.com
Timescale planning seems to fall apart somewhere between 1,100 and 2,600 chunks. Not good!
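
For anyone following along: set_chunk_time_interval only affects chunks created after the call, so re-chunking existing data means rebuilding the hypertable. Roughly what we did, assuming a time column named time (adjust table and column names to your schema):

CREATE TABLE raw.numericsamples_new (LIKE raw.numericsamples INCLUDING ALL);
SELECT create_hypertable('raw.numericsamples_new', 'time', chunk_time_interval => INTERVAL '7 days');
INSERT INTO raw.numericsamples_new SELECT * FROM raw.numericsamples;
-- then swap the table names inside a transaction and drop the old one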

Yeah, there is a Seq Scan in that plan, which doesn't look like the most efficient approach since it traverses the entire table. Have you tried creating an index on pointname?

((pointname)::text = ANY ('{MAC.SCADA.MC1P023Ctrl,MAC.SCADA.MC1F20681Flow}'::text[]))
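
Something like the following, where points stands in for whatever table pointname lives on (the name is a guess):

-- CONCURRENTLY avoids blocking writes while the index builds
CREATE INDEX CONCURRENTLY points_pointname_idx ON points (pointname);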

Sure, but that table isn't the hypertable and is quite small. The underlying issue is that the query planner gets really slow once the number of chunks in a hypertable grows past a certain threshold.
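
To see the planner cost on its own, comparing the "Planning Time" line that EXPLAIN (ANALYZE) prints on both databases is enough; a rough example, with the column and filter assumed:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM raw.numericsamples
WHERE "time" > now() - INTERVAL '1 day';
-- the "Planning Time: ... ms" line is the number that regressed here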

For some reason I lost track of this thread. Did you manage to fix it?

If the planner has to consider all the chunks, or a very large number of them, it will really kill the performance you expect.

Tuning the chunk size to balance planning overhead against query performance may help.
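
When tuning this, a quick way to see how many chunks each hypertable has accumulated (the view is available in Timescale 2.x):

SELECT hypertable_name, count(*) AS chunks
FROM timescaledb_information.chunks
GROUP BY hypertable_name
ORDER BY chunks DESC;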