Data deduplication technology considerations for ROBOs

Data deduplication technology can help remote offices and branch offices (ROBOs) address bandwidth restrictions. Learn the factors to consider before adopting data deduplication technology.

Can data deduplication technology help us get around bandwidth restrictions on data backups at our remote office? What considerations should be made before adopting data deduplication?

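Yes. Source deduplication removes redundant data on the client before it is sent, which reduces both the data crossing the WAN and the total amount stored. That can sidestep the need for expensive, high-bandwidth inter-site links to back up or replicate data.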
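As a rough illustration of why source deduplication saves bandwidth, the sketch below splits a backup into chunks, hashes each one, and sends only the chunks the central site has not already seen. It is a minimal sketch, not any vendor's implementation: the fixed 4 KB chunk size, the use of SHA-256, and the names `source_side_backup` and `server_index` are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products often use variable-size chunks


def chunks(data, size=CHUNK_SIZE):
    """Yield consecutive fixed-size chunks of the input bytes."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


def source_side_backup(data, server_index):
    """Hypothetical client-side step of a source-deduplicated backup.

    Returns (recipe, payload): the recipe lists every chunk hash in order,
    and the payload holds only the chunks the central site does not have.
    """
    recipe = []
    payload = {}
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        recipe.append(digest)
        if digest not in server_index and digest not in payload:
            payload[digest] = chunk  # only new, unique chunks cross the WAN
    return recipe, payload


# Hashes already held at the central site (empty before the first backup).
server_index = set()

# First backup: three chunks, two of which are identical.
monday = b"A" * 8192 + b"B" * 4096
recipe, payload = source_side_backup(monday, server_index)
server_index.update(payload)
first_bytes = sum(len(c) for c in payload.values())

# Second backup: the same data plus one new chunk.
tuesday = monday + b"C" * 4096
recipe2, payload2 = source_side_backup(tuesday, server_index)
second_bytes = sum(len(c) for c in payload2.values())

print(f"First backup sent {first_bytes} of {len(monday)} bytes")
print(f"Second backup sent {second_bytes} of {len(tuesday)} bytes")
```

In this toy run the second backup sends only the one new chunk, which is the effect a ROBO is looking for on a constrained link. The hashing work happens on the client, which is the source of the host overhead discussed in the points that follow.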

To avoid potential pitfalls, however, consider the following points before adopting any type of data deduplication technology in remote offices and branch offices (ROBOs):


  • Consider the type of data: Data that has already been compressed typically deduplicates poorly. Structured files typically contain a good deal of redundant data, and as a result are excellent candidates for deduplication. Unstructured data consisting of many unique files -- images, for example -- will not achieve such good data reduction ratios.
  • Consider the load on clients: Source-based data deduplication incurs a performance penalty on the host because the deduplication processing runs there before the backup data is sent. The overhead scales with the size of the data set: the larger the group of files that need to be checked for uniqueness, the longer and more intense the processing on the host.
  • Restore considerations: Restoring deduplicated data requires 'rehydration' of the data, which adds overhead to the restore time. Whilst this overhead is negligible for single file restores, it grows with the amount of data being restored, so it can become significant in a full site recovery.
  • Cost is always a consideration: While the introduction of data deduplication in this case is driven by the need to reduce the necessary bandwidth (and associated costs) between sites, careful consideration is needed to ensure that the correct technology is chosen to maximise the return on investment. So be certain of the business case for deduplication before proceeding.
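One way to test the data-type point before committing to a purchase is to measure how well a sample of the office's own data deduplicates. The sketch below is an assumption-laden estimate rather than a sizing tool: it uses fixed 4 KB chunks and SHA-256, the function name `dedup_ratio` is made up for the example, and seeded random bytes stand in for compressed or otherwise unique data.

```python
import hashlib
import random


def dedup_ratio(data, chunk_size=4096):
    """Estimate a deduplication ratio: logical size / size of unique chunks."""
    unique = {}
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        # Remember the size of each distinct chunk once, keyed by its hash.
        unique.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    stored = sum(unique.values())
    return len(data) / stored if stored else 1.0


# Highly repetitive sample: the same 4 KB block of records repeated 16 times.
record_block = b"customer-record;" * 256
structured_like = record_block * 16

# Stand-in for compressed or unique data: seeded pseudo-random bytes.
rng = random.Random(42)
unique_like = bytes(rng.getrandbits(8) for _ in range(len(structured_like)))

ratio_structured = dedup_ratio(structured_like)
ratio_unique = dedup_ratio(unique_like)

print(f"Repetitive sample:  {ratio_structured:.1f}:1")
print(f"Random-like sample: {ratio_unique:.1f}:1")
```

Real data sets fall somewhere between these two extremes, so running an estimate like this over a representative sample gives a more honest input to the business case than a vendor's headline ratio.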

