Is there any procedure documented to take incremental backup in multi node environment?

Mohammed_Iliyas_pate · May 10, 2022, 8:08am

Is there a documented procedure for taking incremental backups in a multi-node (self-hosted distributed hyper table ) environment?

If not, could you please advise standard steps? Also, from what I understand, backups should be taken for each access node and data node.

Please make suggestions.

ryanbooz · May 11, 2022, 2:38pm

@Mohammed_Iliyas_pate,

TimescaleDB has a function for distributed hypertables called create_distributed_restore_point() which can be used in connection with your backup procedures of all data nodes (and access node) to ensure you have a consistent point to restore all nodes to in the event of a failure.

Mohammed_Iliyas_pate · May 13, 2022, 12:01pm

Thanks, @ryanbooz I am able to generate a backup file using create_distributed_restore_point(), but the file is always getting generated under the pg_wal folder even though I have configured archive_command in postgresql.config

archive_command = ‘cp %p \%f’

Observation: The generated file gets copied into the archived folder through the command select pg_switch_wal(), but create_distributed_restore_point does copy the generated file to an archive folder.

naveenpotlapally · June 8, 2022, 8:55am

@ryanbooz Thanks for giving solution on this thread, Actually I have a doubt on backup and restore/recovery that is
once I create a distributed restore point and take a base backup And at later point of time may be after some days if I would like to restore and recover the database on some other host until the restore point.
Here how will a dba will identify or know the restore point name from backup to recover. appreciate if you can help me with this!!

Mohammed_Iliyas_pate · June 15, 2022, 12:27pm

@naveenpotlapally : I have documented the complete procedure to take incremental backup and restore.

@ryanbooz @LorraineP : Feel free to correct if anything is missed/improved.

Stop Postgres service
Modify config
wal_level = replica
archive_mode = on
archive_command = ‘copy %p C:\ProgramData\archivedir\%f’
archive_timeout = 600 [ defined wal archive duration]
Create 2 directories : basebackup and WAL Archieve
Start Postgres
Verify: select pg_switch_wal() #it will create incremental wal archieve
Create base Backup:
pg_basebackup -D C:\ProgramData\basebackup\ -Ft -P
Execute command to note time and number of records:
a. postgres=# select now();
now

2022-05-11 13:28:34.24433-07
(1 row)

b. postgres=# select count(*) from locate;
count

(1 row)
8. Create check points,
select pg_switch_wal()
9. Stop postgres.

Rename Postgres data folder and create an empty data folder.
Extract Base Backup
tar -xvf C:\ProgramData\basebackup\base.tar -C C:\ProgramData\PostgreSQLData\
Create recovery.signal file under data file and add below contents,
restore_command=‘copy C:\ProgramData\archivedir\%f %p’
recovery_target_time = ‘2022-05-11 13:25:26’ #Wal archieve time
Update postgresdata recovery flags,
restore_command = ‘copy C:\ProgramData\archivedir\%f %p’
recovery_target_time = ‘2022-05-11 13:25:26’ #Wal archieve time
Start Postgres – It restores base backup + Wal archieve
Still database will be in recovery mode, hence run the below command to move out of recovery mode.
Select pg_wal_replay_resume();

Thank you.

Is there any procedure documented to take incremental backup in multi node environment?

b. postgres=# select count(*) from locate; count

b. postgres=# select count(*) from locate;
count