Let's drill down into those daily and weekly totals. A nightly backup is usually an incremental backup and, by the common rule of thumb of 10% data change per day, accounts for roughly 10% of your total data store.
Let's say a full backup is 20 TB. A daily incremental backup of 2 TB over a 12-hour backup window -- between 1800 and 0600 -- would require a sustained data transfer rate of roughly 50 MBps (171 GB/hour) for 2 TB of data in its raw format.
If we assume a weekend full backup can run all day Saturday and Sunday, this equates to 20 TB over a 48-hour backup window and a sustained transfer rate of roughly 125 MBps (427 GB/hour) for the full data set in its raw format.
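As a back-of-the-envelope check, the two rates above can be reproduced with a minimal Python sketch (using binary units, 1 TB = 1024 GB, which matches the article's rounding):

```python
def required_throughput(data_tb, window_hours):
    """Return (GB/hour, MB/s) needed to move data_tb within window_hours.

    Uses binary units (1 TB = 1024 GB, 1 GB = 1024 MB).
    """
    gb = data_tb * 1024
    gb_per_hour = gb / window_hours
    mb_per_sec = gb * 1024 / (window_hours * 3600)
    return gb_per_hour, mb_per_sec

# Nightly incremental: 2 TB in a 12-hour window
print(required_throughput(2, 12))    # ~171 GB/hour, ~48.5 MB/s
# Weekend full: 20 TB in a 48-hour window
print(required_throughput(20, 48))   # ~427 GB/hour, ~121 MB/s
```

The stated 50 MBps and 125 MBps figures are these results rounded up to comfortable planning numbers.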
If your ISP connection guarantees you 171 GB/hour of capacity between 1800 and 0600 each weekday, and 427 GB/hour on Saturdays and Sundays, then an offsite backup service may work for you. Don't forget to factor in year-on-year data growth, as these figures will increase over time.
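To see how quickly growth erodes headroom, a simple compounding projection can be sketched as follows (the 20% annual growth rate is a hypothetical figure for illustration, not from the article):

```python
def grown_requirement(base_tb, annual_growth, years, window_hours):
    """Project the required GB/hour after compounding annual data growth.

    Binary units: 1 TB = 1024 GB.
    """
    tb = base_tb * (1 + annual_growth) ** years
    return tb * 1024 / window_hours

# Hypothetical 20% annual growth, three years out:
# the 2 TB nightly incremental in its 12-hour window
print(grown_requirement(2, 0.20, 3, 12))   # ~295 GB/hour, up from ~171 today
```

Even modest growth rates can push a connection that is adequate today past its guaranteed capacity within a contract term.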
Remote data backup service traffic will share the ISP connection with other IP activity in your organisation, so quality of service (QoS) controls may be required to guarantee backup throughput. Can your offsite backup service supplier guarantee the required bandwidth over the Internet? Recovery also needs consideration, to ensure IP bandwidth is available during the working day for single-file restores as well as full system recovery.
The throughputs stated above apply to a service where you send data offsite from your own infrastructure. Some suppliers offer on-site appliances that replicate to a remote location, which allows replication to be spread over 24 hours and so reduces the required throughput.
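Using the same arithmetic as before, spreading the nightly 2 TB change over 24 hours instead of a 12-hour window roughly halves the required rate:

```python
# Same 2 TB nightly change, replicated continuously over 24 hours
# (binary units: 1 TB = 1024 GB, 1 GB = 1024 MB)
gb_per_hour = 2 * 1024 / 24                  # ~85 GB/hour, down from ~171
mb_per_sec = 2 * 1024 * 1024 / (24 * 3600)   # ~24 MB/s, down from ~48.5
print(gb_per_hour, mb_per_sec)
```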
Remote data backup services and data deduplication
Remote data backup services may also offer data deduplication technologies to minimise data transfer volumes, as well as reduce bandwidth requirements and the time needed to perform backups. Data deduplication can be performed by an on-site appliance or the host where the data resides. In the latter case there will be CPU and memory overhead, so ensure your systems have the capacity for this.
Bear in mind that data deduplication takes time to run. It may delay the start of the backup, but should allow the backup to complete in a shorter timeframe. The same delays and overheads apply when reconstructing deduplicated data during a restore.
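The principle behind the reduced transfer volume can be illustrated with a toy fixed-size-chunk deduplication sketch (a simplified model; real products use more sophisticated variable-length chunking):

```python
import hashlib

def dedupe_chunks(data: bytes, chunk_size: int = 4096):
    """Toy fixed-size-chunk deduplication.

    Returns a store of unique chunks keyed by SHA-256 digest, plus the
    ordered list of digests (the 'recipe') needed to reconstruct the data.
    Only the unique chunks would need to cross the wire.
    """
    store, recipe = {}, []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # each unique chunk stored once
        recipe.append(digest)
    return store, recipe

def restore(store, recipe):
    """Reassemble the original data from the chunk store and recipe."""
    return b"".join(store[d] for d in recipe)

# Highly repetitive sample data: four 4 KB chunks, only two unique
data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
store, recipe = dedupe_chunks(data)
print(len(recipe), len(store))   # 4 chunks in the stream, 2 unique
assert restore(store, recipe) == data
```

This also makes the restore-side cost visible: reconstruction has to look up and reassemble every chunk, which is the overhead referred to above.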
A service provider should offer encryption to ensure your backup data is adequately protected both as it passes over the Internet and while it is held by your remote backup supplier.
If you currently use backup agents or protocols, make sure your offsite backup service supplier supports equivalents.
Ensure disaster recovery (DR) requirements are not compromised and that recovery point objectives (RPOs) and recovery time objectives (RTOs) are achievable with the available bandwidth. A DR test is always a good idea after significant infrastructure change, so it's worth performing one here to ensure DR processes and procedures are in line with the new backup offering.
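An RTO sanity check is the same throughput arithmetic run in reverse. The 100 MB/s link speed and 48-hour RTO below are hypothetical figures for illustration:

```python
def restore_hours(data_tb, mb_per_sec):
    """Hours to pull data_tb back over a link sustaining mb_per_sec.

    Binary units: 1 TB = 1024 * 1024 MB.
    """
    return data_tb * 1024 * 1024 / mb_per_sec / 3600

# Full 20 TB recovery over a 100 MB/s link against a 48-hour RTO
hours = restore_hours(20, 100)
print(hours)   # ~58 hours -- the 48-hour RTO would be missed
```

If the result exceeds the RTO, either the link, the recovery method (e.g. an appliance or shipped media), or the objective itself has to change.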
Finally, if and when you want to opt out, be sure you know how to obtain a copy of your long-term retention backups for regulatory purposes. Make sure you know the cost of this service and in what format the backups will be provided.
This was first published in April 2010