Unless your backup retention policies dictate that you only need to keep backups for a few weeks, you are still likely to need a tape archive for long-term storage of important data, such as monthly or year-end data sets.
However, more and more clients I work with are considering retaining their daily and weekly backups (which typically only need to be kept for two to four weeks) on disk using data deduplication. This is particularly attractive if you can replicate the dedupe images between intelligent devices across data centres for disaster recovery (DR) purposes.
We can perform data deduplication in several different places nowadays. Just about all new virtual tape library (VTL) devices feature dedupe functionality, and backup software vendors are increasingly incorporating it into their backup applications (either at the client or at the backup server). You also have the option of data deduplication for primary storage with certain network-attached storage (NAS) solutions. Backup data deduplication solutions analyse and process the data stream between the backup server or client and the target storage device, eliminating duplicated data patterns before they are stored.
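To make the idea of eliminating duplicated data patterns concrete, here is a minimal sketch in Python. It assumes fixed-size chunks and SHA-256 fingerprints; the `DedupeStore` class and its method names are hypothetical and do not represent any vendor's implementation. Commercial products commonly use variable-length chunking and other refinements.

```python
import hashlib


class DedupeStore:
    """Toy deduplicating target: each unique chunk is stored only once."""

    def __init__(self, chunk_size=4096):
        # chunk_size is an illustrative assumption, not a recommended value
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes

    def write(self, data):
        """Split a backup stream into chunks and return its 'recipe'
        (the ordered list of chunk fingerprints)."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Only store the chunk if this pattern has not been seen before
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        return recipe

    def read(self, recipe):
        """Rebuild the original stream from its recipe."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def stored_bytes(self):
        """Bytes actually held on disk after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Writing two backups that share most of their chunks only increases `stored_bytes()` by the size of the chunks that are new, while `read()` can still rebuild each backup in full from its recipe.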
Data deduplication has the potential to reduce the size of backup disk volumes by anywhere from 20% to 95%, but some careful analysis is needed at the scoping and sizing phases to ensure that your backup data will actually dedupe well. Some image or file pre-compression utilities, for example, create data sets that cannot be reduced any further in size and so are not good candidates for data deduplication.
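As a rough aid to that sizing analysis, the sketch below estimates the percentage reduction for a set of sample backups. It assumes fixed-size chunking and ignores compression and metadata overhead; the function name `estimate_reduction` and the default chunk size are hypothetical. Treat the result as an indication only, since real devices chunk and index data differently.

```python
import hashlib


def estimate_reduction(samples, chunk_size=4096):
    """Return the estimated size reduction (in percent) if the given
    backup samples (an iterable of bytes objects) were deduplicated."""
    total_bytes = 0
    unique_chunks = {}  # fingerprint -> chunk length
    for data in samples:
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            total_bytes += len(chunk)
            fingerprint = hashlib.sha256(chunk).digest()
            unique_chunks.setdefault(fingerprint, len(chunk))
    if total_bytes == 0:
        return 0.0
    stored_bytes = sum(unique_chunks.values())
    return 100.0 * (1 - stored_bytes / total_bytes)
```

Running this against a representative sample of real backup sets, including any pre-compressed ones, gives an early warning if a particular data set shows little or no reduction.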
Depending on the age of your existing virtual tape library, it may be time to think about an upgrade or replacement to help make your backup life easier.