Query reads all chunks Select + order by timestamp desc + Limit 1

semir · March 20, 2023, 7:44pm

Hi there,

I’ve run into trouble when trying to select from a hypertable.
Even though the query uses the timestamp and limit 1, all the chunks are being read.

CREATE TABLE “app_values2” (
id integer NOT NULL
REFERENCES app_identifiers
ON DELETE CASCADE,
tstamp timestamp with time zone NOT NULL,
name smallint DEFAULT 0,
value double precision

);
SELECT create_hypertable(‘app_values’, ‘tstamp’,
chunk_time_interval => interval ‘1 day’);

And here is the EXPLAIN output with the query:
https://explain.depesz.com/s/m9E7

I checked and ofc, all the chunks have the time constraint on them.
I have added a brin index on tstamp, too.

Any help is appretiated.
Thanks,
Andárs

jonatasdp · March 22, 2023, 1:59pm

Hi @semir! that is the expected behavior, as timescaledb does not filter any data.

If you want to make it fast, just add a where clause considering a specific time frame.

Example:

select ... where tstamp > now() - INTERVAL '1 week' order by tstamp desc limit 1;

If you know you’ll have some data in the last week, that will be much faster than scanning all the chunks.

If you have several app_identifiers and you want to get the latest record for each device, you can also introduce continuous aggregates to make it a type of index. So, you don’t need to scan in the large table but only in the continuous aggregates.

semir · March 22, 2023, 2:21pm

Hi,

oookay, I got your point. But still I dont understand that in time indexed, time chunked table why does postgres wanna go through all the data to get the very last one?
Isn’t there really nothing else I can do except addint the where clause?
Because the brin index on tstam, postgres should be perfectly aware which chunk it should use, did I mess something up?

Thanks,
András

jonatasdp · March 22, 2023, 2:31pm

Talking internally here, some teammate arose the point that it seems the BRIN index cannot be used for ordering.

BRIN indexes are designed to provide efficient storage and fast lookups for large tables with natural sort order, such as time-series data. The primary use case for BRIN indexes is to speed up range-based queries, not to maintain sorted order.

semir · March 22, 2023, 2:49pm

Thanks for your reply.
But that happens to be the use case here, as I see it.

jonatasdp · March 22, 2023, 2:58pm

You’re welcome @semir!

Talking to another developer here, he mentioned that the chunks will only be excluded at runtime they can’t be done at plan time. So an explain analyze will show that.

Can you confirm if BRIN indexes allow that exclusion to happen at run time or not?

If they don’t, please, fill in a feature request for that. Using just an explain rather than explain analyze will help understand the issue.

semir · March 22, 2023, 3:22pm

Hi,
the reason I did not have analyze on was because postgres started to read through all the 150GB data.
But now I waited for it to finish: OAVW | explain.depesz.com

Thanks,
András

semir · March 29, 2023, 8:26pm

Hi @jonatasdp ,

any new ideas mabye?

Thanks,
András

jonatasdp · April 3, 2023, 1:57pm

Hi András, it seems some of the chunks are being necessary to decompress every time you run the query. If you have any clause that are in the query very often, it would be great to segment or index by this clause.

When you add compression, you can also use segmentby which will not compress the segmentby column, so when you run the query id does not need to decompress the chunk for the where clause.

semir · April 3, 2023, 2:21pm

Hi ,

yes, but tbh the compression was added after I opened this post.
I still dont understand why postgres needs to look for the latest record in any old chunks with an index on the timestamp.

Thanks,
András