Backup archives migration to Borg

Last weekend I found a number of encrypted hard drives that I used for periodic backups from 2006 to 2014. At the time, the backups were made using rsync with hard links, so that a file that had not changed since the previous backup was stored only once.

I wanted to check that everything was still there and move it all into a Borg repository, so that I can benefit from compression and deduplication to shrink these backups further and store them in a more secure way.

Check the backups

The backups were made using hard links, with one folder per backup, as follows:

$ ls backups/feronia/
back-2014-06-19T19:05:10/  back-2014-10-10T07:30:00/
back-2014-12-24T14:34:44/  current@
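In a layout like this, a file that did not change between two backups should be the same inode in both folders. A minimal sketch of how to confirm that, assuming a shell whose `test` builtin supports `-ef` (bash and dash both do); the `same_file` helper and the example path are hypothetical:

```shell
# Return success when two paths are hard links to the same file
# (same device and inode), as expected for a file that did not
# change between two backups.
same_file() {
    [ "$1" -ef "$2" ]
}

# Hypothetical usage, run from backups/feronia/ (etc/fstab is only an example path):
# same_file back-2014-06-19T19:05:10/etc/fstab back-2014-10-10T07:30:00/etc/fstab \
#     && echo "stored once"
```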

To check that the backups were still readable, I listed the contents of the different folders and checked that some known configuration files were present and matched what I expected. This worked until I reached the backups made before I started using awesomewm, which is when I changed a lot of config files to match my usage instead of keeping the defaults.

All in all, the backups were still good and readable, so I could use them as a basis for the transition to a more robust and space-efficient system.
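This kind of spot check can be scripted. The sketch below is not what I ran at the time; it assumes GNU coreutils (`sha256sum`) and folders named `back-*`, and the `check_known_file` helper is a hypothetical name:

```shell
# For every backup folder under a root, report the checksum of one
# known file, or MISSING when the file is absent from that backup.
check_known_file() {
    root=$1      # folder containing the back-* directories
    relpath=$2   # path of the known file inside each backup
    for dir in "$root"/back-*/; do
        [ -d "$dir" ] || continue
        if [ -f "$dir$relpath" ]; then
            sum=$(sha256sum "$dir$relpath" | cut -d ' ' -f 1)
            printf '%s %s\n' "$sum" "$dir"
        else
            printf 'MISSING %s\n' "$dir"
        fi
    done
}
```

Called as `check_known_file backups/feronia home/user/.bashrc` (the file path is only an example), it prints one line per backup. Identical checksums on consecutive lines mean the file did not change between those backups.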

I saw a number of freezes during the check, which I interpreted as signs of old age for the spinning rust.

Initialize the Borg backup

The first step is to initialize the borg repository. We will put it on one of the known-good backup drives that still has some room. To estimate the space needed for the backups, I took the size of the most recent backup and multiplied it by two: I know that I did not delete many files over the years, and deduplication will shrink the old backups, which contain a lot of checked-out Subversion repositories.
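The rule of thumb above can be checked with `du` and `df`. This is a rough sketch: the helper names are hypothetical, and the factor of two is my estimate, not a guarantee.

```shell
# Size of a directory tree in KiB, as reported by du.
dir_size_kib() {
    du -sk "$1" | cut -f 1
}

# Free space in KiB on the filesystem holding a path.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Rule of thumb from the text: the destination should have at least
# twice the size of the most recent backup available.
has_room_for() {
    latest=$1   # most recent backup folder
    dest=$2     # any path on the destination drive
    needed=$(( $(dir_size_kib "$latest") * 2 ))
    [ "$(free_kib "$dest")" -ge "$needed" ]
}
```

For example, `has_room_for backups/feronia/back-2014-12-24T14:34:44 /mnt/good-drive && echo "enough room"`, where `/mnt/good-drive` stands in for wherever the destination drive is mounted.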

So, with a destination for my borg repository, I created a folder on the disk and gave my user read-write rights on this folder.

$ sudo mkdir backups/borg-feronia
$ sudo chown -R 1000:1000 backups/borg-feronia

Then, the creation of the repository with borg:

$ borg init --encryption=repokey backups/borg-feronia
Enter new passphrase: 
Enter same passphrase again: 
Do you want your passphrase to be displayed for verification? [yN]: n
[...]

I decided to use the repokey encryption mode. This mode stores the key in the repository, so I only need to remember the passphrase and do not have to worry about backing up a separate key file.

Transfer the existing backups to Borg

Now that the borg repository has been initialized, we can start migrating the backups from the hard-linked folders into borg.

As borg deduplicates by content, the hard links between the folders do not matter: we can simply loop over the folders and create a new archive from each one. It takes some time, because for each folder borg walks the whole content, hashes it, checks whether it changed, deduplicates it, compresses it and then writes it. Each backup of approximately 70 GiB took one hour to migrate on my computer. The process seems to be limited by the single-thread performance of the CPU.

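$ export BORG_PASSPHRASE=asdf  # placeholder, use your own passphrase
$ for i in back*; do
    # Archive name is the date part of the folder name, e.g. 2014-06-19
    archivename=$(echo "$i" | cut -c 6-15)
    # Skip the folder if we cannot enter it
    pushd "$i" || continue
    borg create --stats --progress ~/backups/borg-feronia::"$archivename" .
    popd
  done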

With the passphrase in the environment variable, borg does not prompt for it at each archive, so we can walk away at this stage and let the computer do its magic for some hours.
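Before starting a run that lasts several hours, it can be worth previewing the archive names the loop will produce. A small sketch without any call to borg; `archive_name` and `preview_archives` are hypothetical helpers:

```shell
# Derive the archive name from a backup folder name by keeping
# characters 6 to 15, i.e. the date part:
# "back-2014-06-19T19:05:10" -> "2014-06-19"
archive_name() {
    printf '%s\n' "$1" | cut -c 6-15
}

# Dry run: print each folder next to the archive name it would get.
preview_archives() {
    for dir in "$@"; do
        printf '%s -> %s\n' "$dir" "$(archive_name "$dir")"
    done
}
```

Running `preview_archives back*` in the backup folder lists the mapping. Two backups made on the same day would map to the same archive name, and borg refuses to create an archive whose name already exists, so any duplicate in this list needs a longer name.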

Check the migrated backups

Once the backups have been migrated, we need to check that everything is in order before doing anything else.

I did the same as before, this time using borg list and borg extract to check that the files are present and that their content is correct.
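For a more systematic check, an archive (or part of one) can be extracted into a scratch folder and compared with the original backup folder. A sketch assuming a `diff` that supports `-q` (GNU and BSD do); `compare_trees` and the scratch path are hypothetical, and the comparison covers file contents only, not ownership or permissions:

```shell
# Compare an original backup folder with a tree extracted from borg.
# Prints the differing or missing paths and returns non-zero on any
# difference.
compare_trees() {
    diff -r -q "$1" "$2"
}

# Hypothetical usage (borg extract writes into the current directory):
# mkdir /tmp/restore && cd /tmp/restore
# borg extract ~/backups/borg-feronia::2014-06-19
# compare_trees ~/backups/feronia/back-2014-06-19T19:05:10 /tmp/restore
```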

Archive these backups

Once the migrated backups have been tested, we can shred the old hard drives that were showing signs of old age.

Since storage is so cheap nowadays, I will also transfer an archive of the Borg backup folder to an online storage service, so that I can retrieve it if the local storage media are destroyed or otherwise unreadable in the future.

I chose to simply create a tar archive of the Borg folder and upload it to AWS S3, since these backups will not be updated. Perhaps some day I will add the more recent backups to this setup, but for now they are a read-only window into the laptop I had during my studies and my first jobs.
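The packing step could look like the sketch below. It assumes GNU coreutils for `sha256sum`; the `pack_repo` helper and the bucket name are hypothetical. The tar file is left uncompressed because the repository content is already compressed and encrypted by borg.

```shell
# Pack a Borg repository folder into a tar file and record its checksum,
# so the archive can be verified after a later download.
pack_repo() {
    repo_dir=$1   # e.g. ~/backups/borg-feronia
    out_tar=$2    # e.g. borg-feronia.tar
    tar -cf "$out_tar" -C "$(dirname "$repo_dir")" "$(basename "$repo_dir")" &&
        sha256sum "$out_tar" > "$out_tar.sha256"
}

# Upload with the AWS CLI; "my-backup-bucket" is a hypothetical bucket name.
# aws s3 cp borg-feronia.tar s3://my-backup-bucket/
# aws s3 cp borg-feronia.tar.sha256 s3://my-backup-bucket/
```

After a download, `sha256sum -c borg-feronia.tar.sha256` confirms that the tar file is intact before it is unpacked.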