Right now the handler process happily writes all deltas received for the tables to back up, and the archiver then moves that data to the final storage. That creates unnecessary writes when the replication slot is far behind the current WAL position on the primary. Consider the following case:
- the latest transaction that touched table A has the final LSN of 10/42.
- an up-to-date backup has been triggered after a certain number of delta files. The table being backed up is at LSN 11/B0 on the primary (the logical slot we create in the backup transaction gives us this number), so all changes before that LSN are already applied to the SQL dump we are going to take.
- the next transaction fetched from the 'deltas' slot that touches table A has the final LSN 10/A00.
Right now we will happily write this transaction and all subsequent ones. However, from 10/A00 until 11/B0 there are around 4GB of data with changes for table A that we don't need to write or process, since the up-to-date backup of table A already includes the result of all those changes.
I'd propose to skip writing those changes in the first place, and to have the archiver remove the changes already written instead of archiving them. Together with the code that purges old deltas when the backup is ready, that would give us a reasonable guarantee that we don't leave the old data hanging around for longer than necessary.
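To make the proposed check concrete, here is a minimal sketch of the skip decision. The names `LSN`, `ParseLSN` and `shouldSkip` are made up for illustration and are not taken from the existing code. It also assumes a change is covered by the dump only when its final LSN is strictly below the table's backup LSN; how the boundary itself should be treated needs to be confirmed.

```go
package main

import "fmt"

// LSN is a WAL position expressed as a 64-bit byte offset.
type LSN uint64

// ParseLSN converts the textual "hi/lo" form (two hex numbers) into an LSN.
func ParseLSN(s string) (LSN, error) {
	var hi, lo uint32
	if _, err := fmt.Sscanf(s, "%X/%X", &hi, &lo); err != nil {
		return 0, fmt.Errorf("invalid LSN %q: %w", s, err)
	}
	return LSN(hi)<<32 | LSN(lo), nil
}

// shouldSkip reports whether a transaction's changes to a table are already
// contained in that table's latest dump. Assumption: strictly-less comparison.
func shouldSkip(txFinalLSN, tableBackupLSN LSN) bool {
	return txFinalLSN < tableBackupLSN
}

func main() {
	backup, err := ParseLSN("11/B0") // LSN reported by the backup slot
	if err != nil {
		panic(err)
	}
	for _, s := range []string{"10/42", "10/A00", "11/B0", "11/C0"} {
		lsn, err := ParseLSN(s)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-7s skip=%t\n", s, shouldSkip(lsn, backup))
	}

	// Distance in bytes between the next delta and the backup position.
	next, _ := ParseLSN("10/A00")
	fmt.Printf("bytes covered by the dump: %d\n", uint64(backup-next))
}
```

With the LSNs from the example above, the distance between 10/A00 and 11/B0 comes out just under 4 GiB, which matches the estimate. The handler could apply the same check per table, keeping the backup LSN of each table in memory.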
As a side note, when removing old deltas after the base backup is complete, we need to purge all deltas before basebackupLSN from both the temp and final directories. The exception is the file currently being written to, which should be handled by the logical backup worker instead of the base backup worker.