Backfill requires the staging tables to be hypertables

Hi Timescale team -

I am trying to backfill historical data into a distributed hypertable using the official documentation at:
TimescaleDB>How-to Guides>Compression>Backfill historical data

Below are the steps we’ve taken so far:

  1. Create a two node cluster (one access node, and one data node).
  2. Create a distributed hypertable named request_logs which is the destination of all the data to be backfilled.
  3. Created tables (NOT temporary table) for each hour of the data to be backfilled. (Ex: request_logs_2022_3_3H3)
  4. Created the required backfill stored procedures using the backfill.sql file from timescaledb-extras git repository

Now, when we try to backfill the data using decompress_backfill , we’re seeing the error below stating request_logs_2022_3_3h3 is NOT a hypertable.

Could you please suggest if the staging tables should also be hypertables? The official backfill documentation doesn’t mention anything in this regard. Is this a missing step in the documentation?

>CALL decompress_backfill(staging_table=>'request_logs_2022_3_3H3', destination_hypertable=>'request_logs');
ERROR:  table "request_logs_2022_3_3h3" is not a hypertable
CONTEXT:  SQL statement "  
            WITH to_insert AS (DELETE 
            FROM request_logs_2022_3_3h3 --source table
            WHERE start_time >= TIMESTAMPTZ '2022-03-03 03:00:00.004106+00' -- time column >= range start
            AND start_time < TIMESTAMPTZ '2022-03-03 03:59:59.993074+00' -- time column < range end
            RETURNING * )
            INSERT INTO request_logs 
            SELECT * FROM to_insert
            ON CONFLICT DO NOTHING -- ON CONFLICT CLAUSE if it exists
            "
PL/pgSQL function decompress_backfill(regclass,regclass,text,boolean,interval,text[],boolean) line 149 at EXECUTE

Hello @murali

Could you confirm the version of TimescaleDB you are running, please? :pray: And also that TimescaleDB is installed and set up on all nodes per this how-to?

Could you confirm the version of TimescaleDB you are running, please?
Using 2.7.0 version

postgres=# SELECT default_version, installed_version FROM pg_available_extensions where name = 'timescaledb';
 default_version | installed_version 
-----------------+-------------------
 2.7.0           | 2.7.0

And also that TimescaleDB is installed and set up on all nodes per this how-to?

Yes, we did follow this official documentation.

1 Like

Fab, let me see if I can get you some help on this one :slight_smile:

1 Like

@LorraineP Any update on this?

Hi!

I’m not quite sure yet why this would be happening. Did it work for any table? It seems like something is being looked up in the hypertable cache, but I can’t find any reason why it would be from auditing the code and looking at the statement that it says is causing it. I did try it out and wasn’t able to reproduce myself, though I did have some trouble with the job moving part of the procedure, but that was permissions issues, so I removed that. From your error, that doesn’t look like where this issue is.

If you can provide a reproducible test case with a small amount of data so I can run it myself to debug that would be incredibly helpful.

Note, I also didn’t use multinode for this, so it could be something multinode related, but I’ll have to futz with that some more. one question would be if you use a normal hypertable in a similar setup, do you get the same error?

1 Like

Thank you @davidkohn. I reached out to you via DM on timescaledb slack for providing access and repro steps.