Understanding backup retention policy

Applies to: VisualSVN Server 3.6 and later

VisualSVN Server provides a retention policy for backups created by scheduled jobs. The policy defines how long these backups are kept, so that backups do not accumulate indefinitely as new scheduled backups are created on a regular basis.

A backup retention policy can be set individually for each Backup Repository Job. The policy has just one setting called the Retention period, which defines how long each backup made by this job should be kept. By default, it is set to 30 days, meaning that each backup created by the job is stored for at least 30 days. Once the job detects that a backup is older than the job’s retention period, the job considers this backup expired and automatically deletes it, except for the following 2 cases:

  1. A job that uses the Mixed full/incremental backup scheme will not delete an expired backup that is part of a backup chain, until the last incremental backup in this chain expires. See Ensuring the completeness of a backup chain below.
  2. Regardless of the job's backup scheme, a job will never delete the most recent backup it made (even if this backup has already expired). See Always preserving the most recent backup below.

A job detects and deletes its expired backups only when the job runs to create a new backup. See At which points in time expired backups are deleted below.
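The expiry rule can be sketched in a few lines of Python. This is only an illustration, not VisualSVN Server code: the function name is_expired and its parameters are hypothetical, and the strict "older than" comparison is an assumption based on the wording above.

```python
from datetime import datetime, timedelta


def is_expired(created_at: datetime, now: datetime, retention_days: int = 30) -> bool:
    """Return True if the backup is older than the retention period.

    Assumption: a backup expires only once its age strictly exceeds
    the retention period (30 days by default).
    """
    return now - created_at > timedelta(days=retention_days)
```

Note that this check alone does not decide whether a backup is deleted: the two exceptions listed above still apply.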

Ensuring the completeness of a backup chain

If you selected the Mixed full/incremental backup scheme in a Backup Repository Job, the job creates so-called chains of backup files.

Note
Prior to VisualSVN Server 5.2, repository backup jobs using the Mixed full/incremental backup scheme were called 'incremental backup' jobs.

A backup chain is a sequence of backup files in which the initial backup is a full backup of a repository and all the subsequent backups are incremental backups. This series is called a chain because each incremental backup records only the portion of the repository data that has changed since the previous backup in the chain, meaning that each incremental backup depends on the preceding backups in the chain. A chain continues until its succession of dependent incremental backups ends. The next, separate chain then starts with a full backup again, and so on.

Restoring from any incremental backup in a chain also requires the preceding part of this chain, starting from the full backup. For this reason, a mixed full/incremental job will not delete any expired backups that are part of a chain until the last incremental backup in this chain expires. After this last incremental backup expires, the job deletes the entire chain at once.
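The chain rule can be illustrated with a minimal Python sketch. The function name chain_is_deletable and the representation of a chain as a list of creation times are assumptions made for this illustration, and the sketch deliberately ignores the separate rule about preserving the most recent backup.

```python
from datetime import datetime, timedelta


def chain_is_deletable(chain: list[datetime], now: datetime, retention_days: int) -> bool:
    """Return True once the newest backup in the chain has expired.

    `chain` holds the creation times of the full backup and of all
    incremental backups that depend on it (hypothetical representation).
    """
    newest = max(chain)  # creation time of the last backup in the chain
    return now - newest > timedelta(days=retention_days)
```

For example, a chain whose last incremental backup was created 4 days ago is kept as a whole under a 7-day retention period, even if its full backup is already 10 days old.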

Always preserving the most recent backup

Regardless of the job's backup scheme, a repository backup job will not delete the most recent backup it made, even if that most recent backup has already expired. This approach guarantees that you will always have at least one backup to restore from, if the job has made any backups at all.

For mixed full/incremental backup jobs, this means that even if an entire backup chain has expired, this chain will only be deleted if a newer chain exists. This newer chain needs to contain at least the initial full backup.
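Combined with the chain rule, this behavior can be sketched as follows. The code is an illustrative assumption rather than the product's implementation: deletable_chains is a hypothetical name, and each chain is modeled simply as a list of backup creation times.

```python
from datetime import datetime, timedelta


def deletable_chains(chains: list[list[datetime]], now: datetime,
                     retention_days: int) -> list[list[datetime]]:
    """Return the chains that may be deleted under the retention policy.

    The newest chain always contains the most recent backup, so it is
    never returned, even if all of its backups have expired.
    """
    if not chains:
        return []
    ordered = sorted(chains, key=min)  # order chains by their full backup time
    cutoff = now - timedelta(days=retention_days)
    # Skip the newest chain; an older chain qualifies once its last backup expired.
    return [chain for chain in ordered[:-1] if max(chain) < cutoff]
```

With a single chain, the function returns an empty list no matter how old that chain is, which mirrors the guarantee that at least one backup always remains.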

At which points in time expired backups are deleted

Each Backup Repository Job tracks and deletes only its own backups, i.e. those backups that were created by this particular scheduled backup job.

Once a backup expires, it is deleted during one of the next runs of its job: either a run that occurs on the backup's expiration date or the first run after that date. The run that performs the deletion can be either a normal scheduled run of the job or an on-demand run manually started by the administrator.

A job run will delete an expired backup only if:

  • this current job run succeeds in creating a newer backup of the same repository;
  • the expired backup does not fall under either of the two exceptions described above (i.e. it is not part of an unexpired chain and is not the job's most recent backup).

Note
As opposed to scheduled or on-demand runs of an existing Backup Repository Job, a one-time backup of a repository by definition is not related to any Backup Repository Job. A one-time backup procedure does not detect or delete expired backups created by any jobs.
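The conditions for deletion during a job run can be sketched as follows. This is a simplified model under stated assumptions: run_job and the create_backup callback are hypothetical names, backups are represented only by their creation times, and the job is assumed not to use backup chains.

```python
from datetime import datetime, timedelta
from typing import Callable


def run_job(existing: list[datetime], now: datetime, retention_days: int,
            create_backup: Callable[[], bool]) -> tuple[list[datetime], list[datetime]]:
    """Simulate one job run and return (kept, deleted) creation times."""
    if not create_backup():
        # The run failed to create a newer backup: nothing is deleted.
        return list(existing), []
    cutoff = now - timedelta(days=retention_days)
    deleted = [b for b in existing if b < cutoff]
    # The backup just created at `now` is the most recent one and is always kept.
    kept = [b for b in existing if b >= cutoff] + [now]
    return kept, deleted
```

In this model, a failed run leaves all existing backups in place, including expired ones, until a later run succeeds.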

Example

To illustrate the above-mentioned principles, let us consider a mixed full/incremental backup job that has its schedule and retention policy set up as depicted in the screenshots below.

[Screenshots: Setting up a Backup Schedule; Setting up a Backup Retention Policy]

The job is set up to create a full backup each Saturday and incremental backups each weekday (Monday through Friday), and it does not run on Sundays. The job has a 7-day retention period.

Let us assume that today is Tuesday, June 11th, and look at the chain of backup files A0-A5 that this Backup Repository Job created from June 1 to June 7.

Su           Mo           Tu           We           Th           Fr           Sa

                                                                              June 1
                                                                              A0
                                                                              Full backup

June 2       June 3       June 4       June 5       June 6       June 7       June 8
             A1           A2           A3           A4           A5           B0
             Incremental  Incremental  Incremental  Incremental  Incremental  Full backup

June 9       June 10      June 11      June 12      June 13      June 14      June 15
             B1           B2
             Incremental  Incremental

June 16      June 17      June 18      June 19      June 20      June 21      June 22

In the calendar above, backups A0-A5 make up one chain, which we will refer to as Chain A: A0 is a full backup and A1-A5 are incremental backups. The next chain, Chain B, has also been started and by June 11th includes backup files B0-B2, where B0 is a full backup and B1 and B2 are incremental backups.

In what state is Chain A on Tuesday, June 11th? At this point, backups A0 and A1 are older than the configured 7-day retention period, so both have expired. However, on June 11th the job cannot yet safely delete A0 and A1, because deleting them would make all the later backups in Chain A (namely, A2-A5) impossible to restore from and thus useless. This is not acceptable, because A4 and A5, for example, have not expired yet and must still work as valid restore points. For this reason, the job preserves even the expired backups in Chain A until it detects that the chain's last backup file, A5, has expired (and thus the whole Chain A has expired). At that point, the job automatically deletes all the files in Chain A at once, as the retention policy prescribes.

This job runs on a daily schedule (except on Sundays), so it can detect that A5 has expired (i.e. is more than 7 days old) either on the 7th or on the 8th day after the A5 file was created, as described above in the section At which points in time expired backups are deleted. So, in our example, the job will detect that A5 has expired and will delete Chain A on June 14th or June 15th.
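The two possible dates can be reproduced with a small simulation. Everything beyond the dates in the example is an assumption made for illustration: the year 2024 (chosen only because June 1 falls on a Saturday in that year), the 02:00 run time, the exact creation times of A5, the strict "older than" comparison, and the helper names scheduled_runs and first_deleting_run. The sketch also assumes that every run succeeds, so that Chain B exists as the newer chain.

```python
from datetime import datetime, timedelta
from typing import Optional

RETENTION = timedelta(days=7)


def scheduled_runs(first_run: datetime, days: int) -> list[datetime]:
    """Daily runs at the time of day of first_run, skipping Sundays."""
    runs = [first_run + timedelta(days=i) for i in range(days)]
    return [r for r in runs if r.weekday() != 6]  # weekday() == 6 is Sunday


def first_deleting_run(last_backup: datetime, runs: list[datetime]) -> Optional[datetime]:
    """Return the first run at which last_backup is older than RETENTION."""
    for run in runs:
        if run - last_backup > RETENTION:
            return run
    return None


runs = scheduled_runs(datetime(2024, 6, 8, 2, 0), days=14)
# A5 finished shortly after the 02:00 run started on June 7.
late = first_deleting_run(datetime(2024, 6, 7, 2, 5), runs)
# A5 was created earlier on June 7, for example by an on-demand run.
early = first_deleting_run(datetime(2024, 6, 7, 1, 0), runs)
print(late.date(), early.date())  # prints: 2024-06-15 2024-06-14
```

Under these assumptions, whether Chain A is deleted on June 14th or June 15th depends only on whether A5 is already more than 7 days old at the moment of the June 14th run.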
