Real-time or near-real-time snapshots, replication and continuous data protection (CDP) have become common methods of protecting data in the datacentre.
But traditional backup, where copies of data or changes to data are made at regular intervals, is still very much a mainstream approach to data protection.
Here we run through the main backup types available – full, differential, incremental, and hybrids created from these such as synthetic and incremental-forever – and where appropriate discuss their pros and cons.
Full backup
This is the most fundamental type of backup, in which a copy is made of all data in a specified dataset. It is also the most time-consuming to create and takes up the most storage capacity. On the plus side, it can be easier to restore data from a full backup than from some other types that must be reconstructed from sets of changed data.
Incremental backup
With a full backup already completed (once a week, for example), incremental backups copy only the data changed since the last backup, whether that was a full or an incremental. The advantage is that this method takes the least time and storage space. The fly in the ointment is that to restore data you must reconstruct it from the last full backup plus all intervening incrementals.
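To make the restore chain concrete, here is a minimal Python sketch. It assumes a toy model in which a dataset is a dictionary of file paths to contents; the names `make_delta` and `restore` are hypothetical and do not correspond to any particular backup product.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, str]          # file path -> content (toy model)
Delta = Dict[str, Optional[str]]   # path -> new content, or None if deleted


def make_delta(previous: Snapshot, current: Snapshot) -> Delta:
    """Record only what changed between two states of the dataset."""
    delta: Delta = {}
    for path, content in current.items():
        if previous.get(path) != content:
            delta[path] = content          # new or modified file
    for path in previous:
        if path not in current:
            delta[path] = None             # file was deleted
    return delta


def restore(full: Snapshot, incrementals: List[Delta]) -> Snapshot:
    """Rebuild the dataset from the last full plus every incremental, in order."""
    state = dict(full)
    for delta in incrementals:
        for path, content in delta.items():
            if content is None:
                state.pop(path, None)
            else:
                state[path] = content
    return state


sunday = {"a.txt": "v1", "b.txt": "v1"}    # weekly full backup
monday = {"a.txt": "v2", "b.txt": "v1"}
tuesday = {"a.txt": "v2", "c.txt": "v1"}

inc_monday = make_delta(sunday, monday)    # changes since the full
inc_tuesday = make_delta(monday, tuesday)  # changes since Monday's incremental
restored = restore(sunday, [inc_monday, inc_tuesday])
```

Each incremental is small, but the restore has to walk the whole chain, so a missing or damaged incremental breaks every restore point after it.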
Differential backup
Also building on a regular full backup, a differential backup (taken daily, for example) copies all data changed since the last full backup. To restore, you therefore need only the last full backup plus the latest differential. The advantage of differential backups is that restores are easier than with a full-plus-incremental regime. The drawback is that daily differentials are likely to be larger and more time-consuming to create than incremental backups.
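The trade-off between differentials and incrementals can be illustrated with a short Python sketch. It assumes a toy dataset held as a dictionary in which files are only added or modified (never deleted); `changed_since` is a hypothetical helper, not part of any backup tool.

```python
from typing import Dict, List

Snapshot = Dict[str, str]  # file path -> content (toy model)


def changed_since(base: Snapshot, current: Snapshot) -> Snapshot:
    """Return the entries of `current` that are new or differ from `base`."""
    return {path: content for path, content in current.items()
            if base.get(path) != content}


full = {"a": "1", "b": "1", "c": "1"}  # weekly full backup
days: List[Snapshot] = [
    {"a": "2", "b": "1", "c": "1"},    # Monday: one file changed
    {"a": "2", "b": "2", "c": "1"},    # Tuesday: another file changed
    {"a": "2", "b": "2", "c": "2"},    # Wednesday: a third file changed
]

# Differential: always compared with the last full backup.
differentials = [changed_since(full, day) for day in days]

# Incremental: compared with the previous backup of either kind.
incrementals = [changed_since(prev, day)
                for prev, day in zip([full] + days, days)]

# A differential restore needs only the full plus the latest differential.
restored = {**full, **differentials[-1]}
```

In this example the differentials hold one, two and then three files, while each incremental holds one. The differential restore, however, touches only two backup sets regardless of how many days have passed.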
Synthetic full backup
A synthetic full backup starts from a full backup and merges subsequent incremental backups into it, providing a full backup that is always up to date. Synthetic full backups have the advantage of being easy to restore from while also being easy on network bandwidth, as only changes are transmitted. That said, there’s a processing overhead at the backup server in excess of that incurred by a simple incremental, but that shouldn’t be too onerous.
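As a rough sketch of the idea, the hypothetical `SyntheticFullStore` class below plays the role of the backup server. It assumes a toy model where the client sends dictionaries of changed files, with `None` marking a deletion, and it counts transmitted entries to show that only changes cross the network.

```python
from typing import Dict, Optional

Snapshot = Dict[str, str]          # file path -> content (toy model)
Delta = Dict[str, Optional[str]]   # path -> new content, or None if deleted


class SyntheticFullStore:
    """Server-side store that folds each incremental into its full copy."""

    def __init__(self, initial_full: Snapshot) -> None:
        self.full: Snapshot = dict(initial_full)
        self.entries_received = len(initial_full)  # the one real full transfer

    def ingest(self, delta: Delta) -> None:
        """Merge an incremental on the server; this is the processing overhead."""
        self.entries_received += len(delta)
        for path, content in delta.items():
            if content is None:
                self.full.pop(path, None)
            else:
                self.full[path] = content

    def restore(self) -> Snapshot:
        """Restore is a single read of the up-to-date synthetic full."""
        return dict(self.full)


store = SyntheticFullStore({"a.txt": "v1", "b.txt": "v1"})
store.ingest({"a.txt": "v2"})                  # Monday's changes
store.ingest({"b.txt": None, "c.txt": "v1"})   # Tuesday's changes
latest = store.restore()
```

Note that this simple version discards the incrementals once they are merged, so only the latest state can be restored.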
Incremental-forever backup
Something of a hybrid between full-plus-incremental backup and synthetic full backup, incremental-forever is based on an initial full backup followed by incrementals. The incrementals are retained separately, and the data can be restored as if from an up-to-date full backup at the required point in time. The theory is that you need never take a full backup again and can restore to any restore point. This is the method used by IBM Tivoli Storage Manager, in which it is called a progressive incremental. As with any approach based on incremental changes, there is the least possible hit to bandwidth and capacity in day-to-day terms.
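A minimal Python sketch of point-in-time restore under this scheme follows. It is an illustration only, not how IBM Tivoli Storage Manager works internally. It assumes a toy dictionary model, ISO-format date strings as restore point labels, and a hypothetical function `restore_at`.

```python
from typing import Dict, List, Optional, Tuple

Snapshot = Dict[str, str]          # file path -> content (toy model)
Delta = Dict[str, Optional[str]]   # path -> new content, or None if deleted


def restore_at(base: Snapshot,
               history: List[Tuple[str, Delta]],
               restore_point: str) -> Snapshot:
    """Apply retained incrementals, oldest first, up to the chosen date."""
    state = dict(base)
    for day, delta in history:
        if day > restore_point:    # ISO dates compare correctly as strings
            break
        for path, content in delta.items():
            if content is None:
                state.pop(path, None)
            else:
                state[path] = content
    return state


base = {"a.txt": "v1", "b.txt": "v1"}          # the only full ever taken
history = [
    ("2024-01-02", {"a.txt": "v2"}),
    ("2024-01-03", {"b.txt": None, "c.txt": "v1"}),
]

as_of_jan_2 = restore_at(base, history, "2024-01-02")
as_of_jan_3 = restore_at(base, history, "2024-01-03")
```

Because every incremental is kept, any date in the history is a valid restore point, at the cost of a longer chain to walk for recent dates.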
Reverse incremental backup
Here, synthetic full backups are the normal mode of operation, but previous incrementals are kept so that you can roll back to restore points prior to the latest full backup. This is used by, for example, virtual machine backup specialist Veeam. The same supplier also has so-called forward incremental backup, in which an initial full backup is followed by incrementals. These are combined into a synthetic full at regular intervals, with the incrementals between synthetic fulls kept to provide restore points.
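One way to picture the reverse incremental approach is the Python sketch below. It is a simplified illustration under a toy dictionary model, not Veeam's actual implementation, and the class `ReverseIncrementalStore` and its methods are hypothetical. Each new backup replaces the stored full, and the store keeps a "reverse delta" describing how to get back to the previous state.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, str]          # file path -> content (toy model)
Delta = Dict[str, Optional[str]]   # path -> older content, or None if absent


class ReverseIncrementalStore:
    """Keeps the newest state as a full, plus deltas that step backwards."""

    def __init__(self, initial_full: Snapshot) -> None:
        self.full: Snapshot = dict(initial_full)
        self.reverse_deltas: List[Delta] = []   # oldest first, newest last

    def backup(self, current: Snapshot) -> None:
        """Store `current` as the new full and remember how to undo it."""
        reverse: Delta = {}
        for path in set(self.full) | set(current):
            old, new = self.full.get(path), current.get(path)
            if old != new:
                reverse[path] = old   # None means the file did not exist before
        self.reverse_deltas.append(reverse)
        self.full = dict(current)

    def restore(self, steps_back: int = 0) -> Snapshot:
        """Return the latest full, rolled back by `steps_back` restore points."""
        if not 0 <= steps_back <= len(self.reverse_deltas):
            raise ValueError("no such restore point")
        state = dict(self.full)
        start = len(self.reverse_deltas) - steps_back
        for reverse in reversed(self.reverse_deltas[start:]):  # newest first
            for path, content in reverse.items():
                if content is None:
                    state.pop(path, None)
                else:
                    state[path] = content
        return state


store = ReverseIncrementalStore({"a.txt": "v1", "b.txt": "v1"})
store.backup({"a.txt": "v2", "b.txt": "v1"})
store.backup({"a.txt": "v2", "c.txt": "v1"})

latest = store.restore()        # no chain to walk for the newest point
one_back = store.restore(1)
two_back = store.restore(2)
```

The design favours the most common case: restoring the latest state is a single read, and only older restore points require deltas to be applied.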