
Data deduplication technology considerations for ROBOs

Data deduplication technology can help remote offices and branch offices (ROBOs) address bandwidth restrictions. Learn the factors to consider before adopting data deduplication technology.

Can data deduplication technology help us get around bandwidth restrictions on data backups at our remote office? What should we consider before adopting data deduplication?

Source deduplication can reduce the total amount of stored data, and thus sidestep the need for expensive, high-bandwidth inter-site links to replicate data.
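Conceptually, source deduplication fingerprints chunks of data on the client and sends only chunks that have not been seen before. The following is a minimal sketch of the idea, assuming fixed-size chunking and SHA-256 fingerprints; real products typically use variable-size, content-defined chunking and a persistent chunk index.

```python
# Minimal source-side deduplication sketch (illustrative only):
# fixed-size chunks, SHA-256 fingerprints, in-memory chunk index.
import hashlib

CHUNK_SIZE = 4096  # bytes; chosen for illustration


def dedupe(data: bytes, seen: set) -> tuple:
    """Split data into chunks and count how many would actually be sent.

    Returns (total_chunks, new_chunks). Chunks whose fingerprint is
    already in `seen` are skipped -- only new chunks cross the WAN link.
    """
    total = new = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        total += 1
        if digest not in seen:
            seen.add(digest)
            new += 1
    return total, new


# Backing up the same data twice: the second pass sends nothing new.
seen = set()
payload = b"x" * 1024 * 1024            # 1 MiB of highly redundant data
print(dedupe(payload, seen))            # (256, 1) -- all chunks identical
print(dedupe(payload, seen))            # (256, 0) -- nothing new to send
```

Because only new fingerprinted chunks traverse the link, a second backup of largely unchanged data consumes a fraction of the bandwidth of a full copy, which is precisely what makes the technique attractive for ROBOs.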

To avoid potential pitfalls, however, consider the following points before adopting any type of data deduplication technology in ROBOs:


  • Consider the type of data: Data that has already been compressed typically yields a poor data deduplication ratio, because compression has already removed most of the redundancy. Structured files typically contain a good deal of redundant data, and as a result are an excellent candidate for deduplication. Unstructured data consisting of many unique files -- images, for example -- will not achieve such good data reduction ratios.
  • Consider the load on clients: Source-based data deduplication incurs a performance penalty on the host because it occurs before the backup takes place. The impact of the processing overhead is determined by the size of the data set (the smaller the data set, the smaller the overhead). Simply put, the larger the group of files that must be checked for uniqueness, the longer and more CPU-intensive the processing on the host.
  • Restore considerations: Restoring deduplicated data requires 'rehydration' of the data, which incurs an overhead in the restore time. Whilst this overhead is negligible in single file restores, in a situation that involves full site recovery (and hence a large amount of data), this overhead will grow accordingly.
  • Cost is always a consideration: While the introduction of data deduplication in this case is driven by the need to reduce the necessary bandwidth (and associated costs) between sites, careful consideration is needed to ensure that the correct technology is chosen to maximise the return on investment. So be certain of the business case for deduplication before proceeding.
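The data-type point above can be illustrated by measuring the achievable reduction on two contrasting data sets. This is a hedged sketch using the same fixed-size chunking idea; the 4 KB chunk size and the sample data are assumptions for illustration, not parameters of any particular product.

```python
# Illustrative comparison of deduplication ratios: repetitive
# (structured-style) data vs. unique data such as compressed files
# or images, modelled here with random bytes.
import hashlib
import os


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Logical size divided by unique-chunk storage -- higher is better."""
    unique = {hashlib.sha256(data[i:i + chunk_size]).digest()
              for i in range(0, len(data), chunk_size)}
    return len(data) / (len(unique) * chunk_size)


repetitive = b"record," * 150_000        # highly redundant records
unique = os.urandom(len(repetitive))     # stands in for compressed/unique data

print(round(dedup_ratio(repetitive), 1))  # tens-to-one reduction
print(round(dedup_ratio(unique), 1))      # roughly 1.0 -- no reduction
```

The repetitive data collapses to a handful of unique chunks, while the random data achieves essentially no reduction, mirroring the difference you should expect between database-style files and pre-compressed or image-heavy data sets.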

For more on data deduplication technology:

1. Learn about data deduplication for primary storage.

2. Find out how to control data growth with target-based data deduplication.

3. Discover the difference between source and target dedupe.

This was last published in April 2010

