Performance issue in write-heavy workload

Hi,

We are in the process of migrating our IoT backend to TimescaleDB.
We are using TimescaleDB 2.14.2 on PostgreSQL 14.
We are currently running load tests with an emulator that generates the expected load.
Our workload is very write-heavy, with multiple threads writing to the same table.
Rows to be inserted are split into batches of 1000 rows, which are processed by 16 worker threads calling functions that contain most of the DAL logic. Before a record is inserted, the logic has to check whether the record already exists; based on that it performs an insert or an update, etc.
During these load tests compression is not enabled and the chunk interval is set to 1 day.
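
To make the access pattern concrete, here is a hypothetical sketch of what each worker does per batch (simplified pseudocode, not our actual DAL code; res_fill_temp is the function shown further below):

BEGIN;
-- 1. Classify the batch: which of the incoming rows already exist in "Result"?
--    _batch stands in for the ResultUDT[] parameter passed by the worker.
CREATE TEMP TABLE batch_temp ON COMMIT DROP AS
SELECT * FROM res_fill_temp(_batch);
-- 2. Update rows that already exist ("ResultValue" is NOT NULL in the table,
--    so a NULL OldResultValue can only come from a LEFT JOIN miss).
UPDATE "Result" r
SET "ResultValue" = b.NewResultValue, "Status" = b.NewStatus
FROM batch_temp b
WHERE r."MPRTFk" = b.MPRTFk
  AND r."ResultTimeStamp" = b.ResultTimeStamp
  AND b.OldResultValue IS NOT NULL;
-- 3. Insert rows that do not exist yet.
INSERT INTO "Result" ("MPRTFk", "ResultTimeStamp", "ResultValue", "Status")
SELECT b.MPRTFk, b.ResultTimeStamp, b.NewResultValue, b.NewStatus
FROM batch_temp b
WHERE b.OldResultValue IS NULL;
COMMIT;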

Our hypertable:


CREATE TABLE "Result" (
    "MPRTFk" INT NOT NULL,
    "ResultTimeStamp" TIMESTAMP NOT NULL,
    "ResultValue" DOUBLE PRECISION NOT NULL,
    "Status" INT NOT NULL
);

ALTER TABLE "Result" ADD CONSTRAINT "PK_Res" PRIMARY KEY ("MPRTFk", "ResultTimeStamp");
SELECT create_hypertable('"Result"', 'ResultTimeStamp', chunk_time_interval => 86400000000);
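
As a side note, the chunk layout can be verified with the standard TimescaleDB 2.x catalog view (illustrative query, not part of our DAL):

SELECT chunk_name, range_start, range_end
FROM timescaledb_information.chunks
WHERE hypertable_name = 'Result'
ORDER BY range_start DESC;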

When we start the test, the initial ingestion speed is around 500k rows/sec, but over time it gradually drops as the number of rows and chunks increases. If we leave the test running for one day, the insert speed ends up at around 30k rows/sec.
After some analysis we found that the problem is in the following function, which joins against the Result table to check which old values already exist:

CREATE OR REPLACE FUNCTION res_fill_temp(
    _results ResultUDT[]
)
RETURNS TABLE(
    MPRTFk INT,
    ResultTimeStamp TIMESTAMP,
    NewResultValue DOUBLE PRECISION,
    NewStatus INT,
    OldResultValue DOUBLE PRECISION,
    OldStatus INT,
    ObjResStatus INT
) AS $$
BEGIN
    -- Return all rows from _results with a LEFT JOIN to fill in old values where they exist
    RETURN QUERY
    SELECT
        rNew.MPRTFk,
        rNew.ResultTimeStamp,
        rNew.ResultValue AS NewResultValue,
        rNew.Status AS NewStatus,
        rOld."ResultValue" AS OldResultValue, 
        rOld."Status" AS OldStatus,
        (rObjRes."Status" & 'x80000000'::bit(32)::integer) AS ObjResStatus
    FROM unnest(_results) AS rNew
    LEFT JOIN public."Result" rOld ON
        rNew.MPRTFk = rOld."MPRTFk" AND
        rNew.ResultTimeStamp = rOld."ResultTimeStamp"
    LEFT JOIN public."ObjectResult" rObjRes ON
        rNew.MPRTFk = rObjRes."MPRTFk" AND
        rNew.ResultTimeStamp = rObjRes."ResultTimeStamp"
    WHERE (x'70000000'::bit(32)::integer & rNew.Status) = 0;
END;
$$ LANGUAGE plpgsql;

Based on the following explain plan, chunk pruning does not work, as all chunks are scanned:

   Nested Loop Left Join  (cost=0.57..358.39 rows=111923 width=40) (actual time=0.034..0.386 rows=3 loops=1)
          Output: rnew.mprtfk, rnew.resulttimestamp, rnew.resultvalue, rnew.status, COALESCE(rold_1."ResultValue"), COALESCE(rold_1."Status"), ("Status" & '-2147483648'::integer)
          Inner Unique: true
          Buffers: shared hit=122
          ->  Hash Left Join  (cost=0.00..0.05 rows=1 width=28) (actual time=0.013..0.017 rows=3 loops=1)
                Output: rnew.mprtfk, rnew.resulttimestamp, rnew.resultvalue, rnew.status, "Status"
                Inner Unique: true
                Hash Cond: ((rnew.mprtfk = "MPRTFk") AND (rnew.resulttimestamp = "ResultTimeStamp"))
                ->  Function Scan on pg_catalog.unnest rnew  (cost=0.00..0.05 rows=1 width=24) (actual time=0.006..0.007 rows=3 loops=1)
                      Output: rnew.mprtfk, rnew.resulttimestamp, rnew.resultvalue, rnew.status
                      Function Call: unnest('{"(3133427,\"2023-06-01 00:30:00\",123.45,1)","(3133428,\"2023-06-01 00:30:00\",,2)","(1,\"2023-06-01 00:30:00\",678.9,3)"}'::resultudt[])
                      Filter: ((1879048192 & rnew.status) = 0)
                ->  Hash  (cost=0.00..0.00 rows=0 width=0) (actual time=0.001..0.001 rows=0 loops=1)
                      Output: "Status", "MPRTFk", "ResultTimeStamp"
                      Buckets: 1024  Batches: 1  Memory Usage: 8kB
                      ->  Result  (cost=0.00..0.00 rows=0 width=0) (actual time=0.001..0.001 rows=0 loops=1)
                            Output: "Status", "MPRTFk", "ResultTimeStamp"
                            One-Time Filter: false
          ->  Append  (cost=0.57..78.25 rows=28 width=24) (actual time=0.119..0.120 rows=1 loops=3)
                Buffers: shared hit=122
                ->  Index Scan using "123_123_PK_Res" on _timescaledb_internal._hyper_89_123_chunk rold_1  (cost=0.57..2.79 rows=1 width=24) (actual time=0.010..0.011 rows=1 loops=3)
                      Output: rold_1."ResultValue", rold_1."Status", rold_1."MPRTFk", rold_1."ResultTimeStamp"
                      Index Cond: ((rold_1."MPRTFk" = rnew.mprtfk) AND (rold_1."ResultTimeStamp" = rnew.resulttimestamp))
                      Buffers: shared hit=14
                ->  Index Scan using "124_124_PK_Res" on _timescaledb_internal._hyper_89_124_chunk rold_2  (cost=0.57..2.79 rows=1 width=24) (actual time=0.009..0.009 rows=0 loops=1)
                      Output: rold_2."ResultValue", rold_2."Status", rold_2."MPRTFk", rold_2."ResultTimeStamp"
                      Index Cond: ((rold_2."MPRTFk" = rnew.mprtfk) AND (rold_2."ResultTimeStamp" = rnew.resulttimestamp))
                      Buffers: shared hit=4
                ->  Index Scan using "125_125_PK_Res" on _timescaledb_internal._hyper_89_125_chunk rold_3  (cost=0.57..2.79 rows=1 width=24) (actual time=0.009..0.009 rows=0 loops=1)
                      Output: rold_3."ResultValue", rold_3."Status", rold_3."MPRTFk", rold_3."ResultTimeStamp"
                      Index Cond: ((rold_3."MPRTFk" = rnew.mprtfk) AND (rold_3."ResultTimeStamp" = rnew.resulttimestamp))

In an attempt to make chunk pruning work, we changed the function as follows, adding a date-range condition to the join clause:

CREATE OR REPLACE FUNCTION res_fill_temp(
    _results ResultUDT[]
)
RETURNS TABLE(
    MPRTFk INT,
    ResultTimeStamp TIMESTAMP,
    NewResultValue DOUBLE PRECISION,
    NewStatus INT,
    OldResultValue DOUBLE PRECISION,
    OldStatus INT,
    ObjResStatus INT
) AS $$
DECLARE
    minTimeStamp TIMESTAMP;
    maxTimeStamp TIMESTAMP;
BEGIN
    SELECT MIN(rNew.ResultTimeStamp), MAX(rNew.ResultTimeStamp)
    INTO minTimeStamp, maxTimeStamp
    FROM unnest(_results) AS rNew;

    -- Return all rows from _results with a LEFT JOIN to fill in old values where they exist
    RETURN QUERY
    SELECT
        rNew.MPRTFk,
        rNew.ResultTimeStamp,
        rNew.ResultValue AS NewResultValue,
        rNew.Status AS NewStatus,
        rOld."ResultValue" AS OldResultValue,  
        rOld."Status" AS OldStatus,
        (rObjRes."Status" & 'x80000000'::bit(32)::integer) AS ObjResStatus
    FROM unnest(_results) AS rNew
    LEFT JOIN public."Result" rOld ON
        rNew.MPRTFk = rOld."MPRTFk" AND
        rNew.ResultTimeStamp = rOld."ResultTimeStamp" 
        AND rOld."ResultTimeStamp" BETWEEN minTimeStamp AND maxTimeStamp
    LEFT JOIN public."ObjectResult" rObjRes ON
        rNew.MPRTFk = rObjRes."MPRTFk" AND
        rNew.ResultTimeStamp = rObjRes."ResultTimeStamp"
    WHERE (x'70000000'::bit(32)::integer & rNew.Status) = 0;
END;
$$ LANGUAGE plpgsql;
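
For reference, the function is called with the whole batch packed into a ResultUDT[] array, e.g. (the ResultUDT field order is inferred from the plans):

SELECT *
FROM res_fill_temp(ARRAY[
    ROW(3133427, '2023-06-01 00:30:00', 123.45, 1)::ResultUDT,
    ROW(3133428, '2023-06-01 00:30:00', NULL, 2)::ResultUDT,
    ROW(1, '2023-06-01 00:30:00', 678.9, 3)::ResultUDT
]);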

Explain plan now looks like this:

    Nested Loop Left Join  (cost=0.57..2.94 rows=33 width=40) (actual time=0.019..0.031 rows=3 loops=1)
          Output: rnew.mprtfk, rnew.resulttimestamp, rnew.resultvalue, rnew.status, COALESCE(rold."ResultValue"), COALESCE(rold."Status"), ("Status" & '-2147483648'::integer)
          Inner Unique: true
          Buffers: shared hit=14
          ->  Hash Left Join  (cost=0.00..0.05 rows=1 width=28) (actual time=0.006..0.008 rows=3 loops=1)
                Output: rnew.mprtfk, rnew.resulttimestamp, rnew.resultvalue, rnew.status, "Status"
                Inner Unique: true
                Hash Cond: ((rnew.mprtfk = "MPRTFk") AND (rnew.resulttimestamp = "ResultTimeStamp"))
                ->  Function Scan on pg_catalog.unnest rnew  (cost=0.00..0.05 rows=1 width=24) (actual time=0.003..0.003 rows=3 loops=1)
                      Output: rnew.mprtfk, rnew.resulttimestamp, rnew.resultvalue, rnew.status
                      Function Call: unnest('{"(3133427,\"2023-06-01 00:30:00\",123.45,1)","(3133428,\"2023-06-01 00:30:00\",,2)","(1,\"2023-06-01 00:30:00\",678.9,3)"}'::resultudt[])
                      Filter: ((1879048192 & rnew.status) = 0)
                ->  Hash  (cost=0.00..0.00 rows=0 width=0) (actual time=0.000..0.001 rows=0 loops=1)
                      Output: "Status", "MPRTFk", "ResultTimeStamp"
                      Buckets: 1024  Batches: 1  Memory Usage: 8kB
                      ->  Result  (cost=0.00..0.00 rows=0 width=0) (actual time=0.000..0.000 rows=0 loops=1)
                            Output: "Status", "MPRTFk", "ResultTimeStamp"
                            One-Time Filter: false
          ->  Index Scan using "123_123_PK_Res" on _timescaledb_internal._hyper_89_123_chunk rold  (cost=0.57..2.79 rows=1 width=24) (actual time=0.007..0.007 rows=1 loops=3)
                Output: rold."ResultValue", rold."Status", rold."MPRTFk", rold."ResultTimeStamp"
                Index Cond: ((rold."MPRTFk" = rnew.mprtfk) AND (rold."ResultTimeStamp" = rnew.resulttimestamp) AND (rold."ResultTimeStamp" >= '2023-06-01 00:30:00'::timestamp without time zone) AND (rold."ResultTimeStamp" <= '2023-06-01 00:30:00'::timestamp without time zone))
                Buffers: shared hit=14

This version generally runs faster; however, the ingestion speed is very unstable, jumping from 500k rows/sec down to 60k rows/sec and back. After some time, some queries take very long to complete (90 s and more) and timeouts start to occur. I assume the reason is locking. It looks like the new range filter fixes chunk pruning but introduces other issues.
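
For what it's worth, a snapshot like the following (only standard PostgreSQL 14 catalog views; pg_blocking_pids lists the blockers) should show what the stalled backends are waiting on:

SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       now() - query_start AS query_age,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND wait_event_type IN ('Lock', 'LWLock')
ORDER BY query_age DESC;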

Any help or insights you can provide on this issue would be greatly appreciated. Thank you in advance for your support and guidance!

I haven’t looked at all the details.

Have you tried adjusting the chunk interval from 1 day to 1 hour? In general, reducing the interval helps ensure the most recent chunk fits in memory.
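
For example (note that set_chunk_time_interval only affects chunks created afterwards; existing chunks keep their interval):

SELECT set_chunk_time_interval('"Result"', INTERVAL '1 hour');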

Hi @asiayeah,

Thanks for the reply.
In those tests the most recent chunk does fit in memory.
We are running these tests on a very capable machine: shared_buffers is set to 64 GB, and a chunk is currently about 35 GB.
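
For reference, per-chunk sizes can be checked with the standard chunks_detailed_size helper, e.g.:

SELECT chunk_name, pg_size_pretty(total_bytes) AS total_size
FROM chunks_detailed_size('"Result"')
ORDER BY total_bytes DESC;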