Insert Performance Distributed Hypertable

Hello,
I’m currently experimenting with TimescaleDB as a Docker container. I’m running TimescaleDB 2.10.1 in a self-hosted Docker setup. I have a cluster of 1 access node and 2 data nodes (1 Gbit LAN in between). Until now I have only stored data on the access node. I have 6 tables, each with around 10-30 columns. Only integer, float, double, timestamp and string columns are used.
I parse files and write the contained data to TimescaleDB. Using only the access node and a simple hypertable, I was able to process a file in ~40 sec, writing all entries (~800k) into the database.

Now I tried to set up a distributed hypertable with the 2 data nodes. I’m using a replication factor of 1 and added a space dimension to each table, so that not all data gets written to one data node. I would have imagined that this could even speed up the inserts. But the insert performance is drastically slower!
Processing the same file takes ~10 min here…
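
To put the two timings side by side, here is a quick back-of-the-envelope calculation. It uses only the approximate figures quoted above (~800k rows, ~40 sec, ~10 min); the variable and function names are made up for illustration.

```python
# Approximate figures quoted in the post above.
ROWS = 800_000             # ~800k entries per file
SINGLE_NODE_SECONDS = 40   # plain hypertable on the access node
DISTRIBUTED_SECONDS = 600  # distributed hypertable, ~10 min


def rows_per_second(rows: int, seconds: float) -> float:
    """Average insert throughput over the whole run."""
    return rows / seconds


single = rows_per_second(ROWS, SINGLE_NODE_SECONDS)
distributed = rows_per_second(ROWS, DISTRIBUTED_SECONDS)
slowdown = DISTRIBUTED_SECONDS / SINGLE_NODE_SECONDS

print(f"single node: {single:,.0f} rows/s")       # single node: 20,000 rows/s
print(f"distributed: {distributed:,.0f} rows/s")  # distributed: 1,333 rows/s
print(f"slowdown:    {slowdown:.0f}x")            # slowdown:    15x
```

So the distributed setup is roughly 15 times slower for the same file.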

I configured the data/access nodes according to the Timescale documentation on multi-node configuration.
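
For readers following along, the setup described above (replication factor 1 plus a space dimension) would presumably be created with a `create_distributed_hypertable` call along these lines. This is only a sketch: the table name `measurements`, the columns `ts` and `device_id`, and the choice of 2 partitions (one per data node) are assumptions, since the actual schema is not shown in this thread. The small Python helper just builds the SQL text.

```python
def distributed_hypertable_sql(
    table: str,
    time_col: str,
    space_col: str,
    partitions: int = 2,   # assumption: one space partition per data node
    replication: int = 1,  # replication factor 1, as described in the post
) -> str:
    """Build the SQL that turns `table` into a distributed hypertable
    with a time dimension and one hash-partitioned space dimension."""
    return (
        f"SELECT create_distributed_hypertable('{table}', '{time_col}', "
        f"partitioning_column => '{space_col}', "
        f"number_partitions => {partitions}, "
        f"replication_factor => {replication});"
    )


# Hypothetical table and column names, for illustration only.
print(distributed_hypertable_sql("measurements", "ts", "device_id"))
```

The printed statement would be run on the access node after the data nodes have been added.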

It would be nice to get some help or tips for improving multi-node insert performance. Currently it’s not really usable for us.
If you need some more information, I will provide everything needed.

Thank you!

Are there any key topics that have to be considered when using distributed hypertables with respect to insert performance?

I just tried the tsdbperf application to write data from my local machine to the access node only:

./tsdbperf-x86_64-unknown-linux-gnu --db-host tsdb_access --db-user postgres --workers 10
[2023-07-24T08:59:43Z INFO tsdbperf] Number of workers: 10
[2023-07-24T08:59:43Z INFO tsdbperf] Devices per worker: 10
[2023-07-24T08:59:43Z INFO tsdbperf] Metrics per device: 10
[2023-07-24T08:59:43Z INFO tsdbperf] Measurements per device: 100000
[2023-07-24T09:00:01Z INFO tsdbperf] Wrote 10000000 measurements in 17.65 seconds
[2023-07-24T09:00:01Z INFO tsdbperf] Wrote 566476 measurements per second
[2023-07-24T09:00:01Z INFO tsdbperf] Wrote 5664760 metrics per second

Does anybody have any tips for getting this fixed? Or at least some hints about what a possible solution could be?

Have you tried a single-node database instance? Is there any specific reason to go for distributed hypertables? In general, a distributed hypertable is slower than a plain single-node instance due to additional planning overhead and multiple network hops.

Thanks for your reply.
The reason was that I want to add replication later on, and I thought that distribution would also increase throughput by spreading the workload across different servers when there is no replication.
Is there a better solution for replication, for example HAProxy?

Currently I’m using my access node more or less as a single database instance, since I’m not using a distributed hypertable.

Distributed hypertables can offer better query performance by parallelizing queries, but only if more than one node holds the required data, and only if the query time outweighs the additional overhead of distribution.
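
As a rough illustration of that trade-off, here is a deliberately simplified model. It assumes the work splits evenly across the nodes that hold the data and that distribution adds a fixed overhead per query; neither assumption comes from TimescaleDB itself, and the numbers below are invented.

```python
def distributed_is_faster(
    single_node_seconds: float,
    nodes_with_data: int,
    overhead_seconds: float,
) -> bool:
    """Toy break-even check: distribution pays off only when the time
    saved by splitting the work exceeds the fixed overhead it adds."""
    if nodes_with_data < 1:
        raise ValueError("at least one node must hold the data")
    distributed_seconds = single_node_seconds / nodes_with_data + overhead_seconds
    return distributed_seconds < single_node_seconds


# A long query spread over two nodes: 10 / 2 + 1 = 6 s, which beats 10 s.
print(distributed_is_faster(10.0, 2, 1.0))   # True
# A short query: 0.05 / 2 + 0.1 = 0.125 s, slower than 0.05 s.
print(distributed_is_faster(0.05, 2, 0.1))   # False
# Only one node holds the data: nothing to parallelize, overhead only.
print(distributed_is_faster(10.0, 1, 1.0))   # False
```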

Replication and HA have nothing to do with distributed hypertables. For that you may want to employ PgBouncer or Pgpool-II, and PostgreSQL’s own primary/replica solution (physical replication).