Telegraf metrics to TimescaleDB

Hi, I am running TimescaleDB 2.11.1 on PostgreSQL 15 and want to collect metrics with Telegraf. I set chunk_time_interval to 7 days, and after seven days the metrics stopped being written: the Telegraf sessions are stuck permanently in a COPY waiting state, and I see deadlocks in the log.
I tried lowering chunk_time_interval to 24 hours, and the result is the same. At midnight no new chunk is created and writing to the database stops.

It seems like Telegraf is holding the connections. Anything interesting in the logs to share? Maybe background workers are stopping or something?

Hello, the log shows something like this:

2023-08-29 00:01:39.404 CEST [1418] telegraf@telegraf LOG:  process 1418 detected deadlock while waiting for ShareUpdateExclusiveLock on relation 20177 of database 19124 after 300001.816 ms
2023-08-29 00:01:39.404 CEST [1418] telegraf@telegraf DETAIL:  Processes holding the lock: 1390, 1418. Wait queue: 1430, 1444, 1465, 1489, 1604, 1619, 1627, 1633, 1655, 1661, 1686, 1693, 1707, 1733, 1742, 1747, 1765, 1892, 1955, 1971, 1993, 1995, 1998, 1416, 1601, 1673, 1754, 1678, 1421, 1596, 31404, 21374, 28739, 16130, 18009, 9394, 23353.
2023-08-29 00:01:39.404 CEST [1418] telegraf@telegraf CONTEXT:  COPY mem, line 1
2023-08-29 00:01:39.404 CEST [1418] telegraf@telegraf STATEMENT:  copy "public"."mem" ( "time", "tag_id", "available_percent", "cached", "huge_pages_free", "huge_page_size", "sunreclaim", "swap_cached", "available", "active", "buffered", "commit_limit", "slab", "vmalloc_total", "used_percent", "dirty", "write_back", "swap_free", "free", "high_total", "low_free", "mapped", "sreclaimable", "vmalloc_chunk", "write_back_tmp", "committed_as", "high_free", "huge_pages_total", "shared", "swap_total", "used", "page_tables", "total", "inactive", "low_total", "vmalloc_used" ) from stdin binary;
2023-08-29 00:01:39.404 CEST [1418] telegraf@telegraf ERROR:  deadlock detected
2023-08-29 00:01:39.404 CEST [1418] telegraf@telegraf DETAIL:  Process 1418 waits for ShareUpdateExclusiveLock on relation 20177 of database 19124; blocked by process 1390.
        Process 1390 waits for ShareUpdateExclusiveLock on relation 23306 of database 19124; blocked by process 1405.
        Process 1405 waits for ShareUpdateExclusiveLock on relation 20363 of database 19124; blocked by process 1406.
        Process 1406 waits for ShareUpdateExclusiveLock on relation 20425 of database 19124; blocked by process 7607.
        Process 7607 waits for ShareUpdateExclusiveLock on relation 20334 of database 19124; blocked by process 1422.
        Process 1422 waits for ShareUpdateExclusiveLock on relation 19966 of database 19124; blocked by process 1418.
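
For reading the DETAIL block above, here is a minimal Python sketch that extracts the wait-for chain and the relation OIDs involved. The function names (parse_deadlock_detail, is_cycle) are made up for illustration, and the regular expression assumes the exact message wording shown in this log; other lock types or PostgreSQL versions may format the lines differently.

```python
import re

# One edge of the wait-for chain in a "deadlock detected" DETAIL block.
# Assumes the relation-lock wording seen in the log above.
EDGE = re.compile(
    r"Process (\d+) waits for (\w+) on relation (\d+) of database (\d+); "
    r"blocked by process (\d+)\."
)

def parse_deadlock_detail(text):
    """Return (waiter_pid, lock_mode, relation_oid, blocker_pid) tuples."""
    return [
        (int(waiter), mode, int(rel), int(blocker))
        for waiter, mode, rel, _db, blocker in EDGE.findall(text)
    ]

def is_cycle(edges):
    """True if every blocker is the next waiter, wrapping around to the first."""
    if not edges:
        return False
    n = len(edges)
    return all(edges[i][3] == edges[(i + 1) % n][0] for i in range(n))

# DETAIL lines copied from the log above.
detail = """\
Process 1418 waits for ShareUpdateExclusiveLock on relation 20177 of database 19124; blocked by process 1390.
Process 1390 waits for ShareUpdateExclusiveLock on relation 23306 of database 19124; blocked by process 1405.
Process 1405 waits for ShareUpdateExclusiveLock on relation 20363 of database 19124; blocked by process 1406.
Process 1406 waits for ShareUpdateExclusiveLock on relation 20425 of database 19124; blocked by process 7607.
Process 7607 waits for ShareUpdateExclusiveLock on relation 20334 of database 19124; blocked by process 1422.
Process 1422 waits for ShareUpdateExclusiveLock on relation 19966 of database 19124; blocked by process 1418.
"""

edges = parse_deadlock_detail(detail)
relations = sorted({rel for _, _, rel, _ in edges})

print(is_cycle(edges))  # True
print(relations)        # [19966, 20177, 20334, 20363, 20425, 23306]
```

The six sessions form a closed cycle across six different relations. Each OID can be resolved to a table name by running SELECT 20177::regclass; while connected to the telegraf database, which should show whether these are the hypertables themselves or other objects.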

It seems the DB server cannot create new hypertable chunks once the chunk time interval elapses, because the Telegraf clients are writing continuously. I'd like to find out whether this has been dealt with before. Maybe it's a bug in the Telegraf postgresql output plugin, or maybe it's in the server settings?

It seems there is too much contention for the same resources. It does not seem related to TimescaleDB itself, but rather to some configuration that triggers parallel processing.

How are the metrics being collected?