After upgrading to version pg16.4-ts2.16.1-all we get this error both on prod and on staging, on large and small clusters, regardless of usage patterns or configuration settings. We have clusters with 3, 4, 5 or 6 nodes, cluster CPU ranges from 2 to 50+ vCPU, and data volumes range from 100GB to 4TB. We have plenty of WAL space (4TB!) with very high settings for WAL retention (max_wal_size = 256GB, wal_keep_size = 128GB). Almost all checkpoints are triggered by time.
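For reference, the relevant values can be confirmed on any running node with a plain pg_settings query (a generic sketch, nothing cluster-specific about it):

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('max_wal_size', 'wal_keep_size', 'checkpoint_timeout');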
The error is this:
2025-02-04 08:57:25 UTC [1226633]: [67a1d675.12b789-2] @,app= [08P01] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 0000001600000ABB00000045 has already been removed
The error happens every time a leader election is triggered by a rolling restart of the StatefulSet’s pods in Kubernetes. How can this be debugged further? Is this a known issue? Is there maybe a fix?
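One thing that seems worth checking on the newly promoted leader right after a failover (a sketch, assuming the standbys are meant to stream via physical replication slots): if the slots are missing or stale on the new leader, it is free to recycle segments before the replicas reconnect, which would produce exactly this error.

-- Run on the new leader shortly after promotion
SELECT slot_name, slot_type, active, restart_lsn
FROM pg_replication_slots;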
Example logs from the leader pod aren’t very helpful:
Feb 2, 2025 @ 10:39:37.585 timescaledb-analytics-5 2025-02-02 09:39:37,585 - WARNING - MainThread - Received kill 2, shutting down
Feb 2, 2025 @ 10:39:37.585 timescaledb-analytics-5 2025-02-02 09:39:37,585 - WARNING - history - Shutting down thread
Feb 2, 2025 @ 10:22:06.623 timescaledb-analytics-5 2025-02-02 09:22:06,623 - INFO - history - Refreshing backup history using pgbackrest
Feb 2, 2025 @ 09:22:03.812 timescaledb-analytics-5 2025-02-02 08:22:03,811 - INFO - history - Refreshing backup history using pgbackrest
We have metrics for disk (IOPS/throughput), network (bandwidth and packets sent), CPU and RAM: none of those were anywhere near their limits.
Hi Jan, I’m a Community Manager and I look over all the issues here, and this looks like a new one to me. Have you tried upgrading to the latest Postgres 17 with the latest TimescaleDB 2.18?
It seems more related to PostgreSQL itself than to anything Timescale-specific.
Googling around I found this:
If you use streaming replication without file-based continuous archiving, the server might recycle old WAL segments before the standby has received them. If this occurs, the standby will need to be reinitialized from a new base backup. You can avoid this by setting wal_keep_size to a value large enough to ensure that WAL segments are not recycled too early, or by configuring a replication slot for the standby. If you set up a WAL archive that’s accessible from the standby, these solutions are not required, since the standby can always use the archive to catch up provided it retains enough segments.
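For example, the replication-slot option mentioned there looks roughly like this (just a sketch, the slot name is purely illustrative):

-- On the primary: reserve WAL for a named standby
SELECT pg_create_physical_replication_slot('standby_analytics_1');

The standby then references that slot via primary_slot_name in its configuration, and the primary keeps whatever WAL the slot still needs.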
wal_keep_segments is a deprecated setting; it was removed in Postgres 13, so it doesn’t exist in Postgres 16 at all. Its replacement is wal_keep_size, which we set to 128GB. That should be plenty for a replica that is expected to sync but hasn’t yet: we write about 50 GB/hour of WAL, and a rolling restart takes about 12-15 minutes, so roughly 12-13 GB of WAL accumulates during a restart, well under 128GB.
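To put numbers behind that 50 GB/hour figure, the current replay gap per standby can be checked on the primary with the standard catalog views (a generic sketch, nothing specific to our setup):

-- Bytes of WAL each connected standby still has to replay
SELECT application_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) AS replay_lag
FROM pg_stat_replication;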
We also have WAL archiving enabled, which makes this error even stranger.
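With archiving in place, a standby should in principle be able to fetch a recycled segment back from the archive via restore_command instead of failing. As a sketch based on the pgBackRest documentation (the stanza name here is just a placeholder, ours differs):

restore_command = 'pgbackrest --stanza=analytics archive-get %f "%p"'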
Upgrading to 17 would be a last resort for us, as our main cluster has 4TB of data and is hosted in Kubernetes. Replicating that much data to a new cluster takes a long time, which means a long downtime and a risk of failure.