Join two hypertables by timestamp produce different plan (fast & slow)

Hi TimescaleDB team,

I’ve encountered a problem where joining two tables with the same timestamp column produce different query plan and execution time. I expect that it executes as fast as the first query, but somehow the second query is very slow.

Here is the tables chain.transactions, chain.receipts and chain.prices:

create table chain.transactions (
    hash text,
    block_number bigint,
    block_timestamp timestamptz
    -- other columns are removed for simplicity
);

select chain.create_hypertable(
    relation => 'chain.transactions',
    time_column_name => 'block_timestamp',
    chunk_time_interval => INTERVAL '1 day'
);

create unique index transactions_hash_index
    on chain.transactions(hash, block_timestamp desc);

create index transactions_block_number_index
    on chain.transactions (block_number);

create table chain.receipts (
    -- ID
    hash text,
    block_timestamp timestamptz
    -- other columns are removed for simplicity
);

select chain.create_hypertable(
    relation => 'chain.receipts',
    time_column_name => 'block_timestamp',
    chunk_time_interval => INTERVAL '1 day'
);

create unique index receipts_hash_unique_index
    on chain.receipts(hash, block_timestamp desc);

create table chain.prices (
    token_hash text,
    block_timestamp timestamptz,
    price double precision
);

create unique index prices_unique_index
    on chain.prices(
        token_hash,
        block_timestamp desc
    );

Joining chain.transactions and chain.receipts is very fast as expected

explain analyze verbose
select 
    transaction.*,
    receipt.*
from chain.transactions transaction
left join chain.receipts receipt on receipt.hash=transaction.hash and receipt.block_timestamp=transaction.block_timestamp
where receipt.status is not null
order by transaction.block_number desc, transaction.transaction_index asc
limit 100

result:

Planning Time: 2.211 ms
Execution Time: 40.437 ms

Here is the complete query plan: transactions join with receipts | explain.dalibo.com

Meanwhile when I’m trying to join chain.transactions and chain.prices is super slow:

set statement_timeout to '1h';
explain analyze verbose
select 
    transaction.*,
    price.*
from chain.transactions transaction
left join chain.prices price on price.block_timestamp=transaction.block_timestamp and price.token_hash='ETH'
order by transaction.block_number desc, transaction.transaction_index asc
limit 10 -- Set limit to 10 because 100 just too slow

result:

Planning Time: 4.352 ms
Execution Time: 179720.454 ms

Here is the complete query plan: transactions join with prices | explain.dalibo.com

I have fixed the second query with CTE and it’s very fast as expected:

-- explain analyze verbose
with transactions as (
    select *
    from chain.transactions
    order by block_number desc, transaction_index asc
    limit 100
)
select 
    transaction.*,
    price.*
from transactions transaction
left join chain.prices price on 
    price.block_timestamp=transaction.block_timestamp 
    and price.token_hash = 'ETH'

The thing that bother me is that before CTE, two queries join on the same column with the similar index and structure, but still somehow the second query is very slow.

I’m intrigued to know why the second query is very slow? The prices and receipts both have the same unique index structure.

Thank you so much!

Edit:
The table size receipts is greater than prices, receipts and transactions have more than 20M rows so I don’t think size is the problem here.

I think I found the answer!

It turns out the culprit was on oder by block_number desc, transaction_index desc:

Adding new index:

create index transactions_block_number_transaction_index
    on chain.transactions (block_number desc, transaction_index desc);

Now both query mentioned above run fast as expected.

First query:

explain analyze verbose
select * from (
    select 
        transaction.*,
        receipt.*
    from chain.transactions transaction
    left join chain.receipts receipt on receipt.hash=transaction.hash and receipt.block_timestamp=transaction.block_timestamp
) data
order by block_number desc, transaction_index desc
limit 100
Planning Time: 1.825 ms
Execution Time: 57.540 ms

Second query:

explain analyze verbose
select * from (
    select 
        transaction.*,
        price.price as eth_price
    from chain.transactions transaction
    left join chain.prices price on price.token_hash='ETH' and price.block_timestamp=transaction.block_timestamp
) data
order by block_number desc, transaction_index desc
limit 100
Planning Time: 1.216 ms
Execution Time: 1.271 ms

Solved!

Thank you so much

1 Like

Wowww! thanks for sharing your entire optimizing saga!