Q

What are the pros and cons of inline and post-process data deduplication?

Chris Reid, consultant with Morse, looks at the pros and cons of inline and post-process data deduplication.

The choice between deduplication technologies comes down to speed versus space. Inline deduplication requires the least space because redundant data is stripped out as it arrives on the system. With current technology (whether software, hardware or both) this is the slower process, which lengthens backup windows. As computing technology advances, this may become less of an issue, provided that processing power keeps ahead of the growth of information.
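To make the inline approach concrete, here is a minimal Python sketch, not a description of any particular product. It assumes fixed-size chunks and SHA-256 fingerprints; the class name `InlineDedupStore`, the tiny `CHUNK_SIZE` and the sample backups are all hypothetical. Every chunk is hashed before it is stored, which is where the extra work in the backup path comes from, and only chunks not seen before take up space.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and unrealistically small, for illustration only


class InlineDedupStore:
    """Toy inline deduplication: duplicates are dropped as data arrives."""

    def __init__(self):
        self.chunks = {}   # fingerprint -> chunk bytes (each stored once)
        self.recipes = {}  # backup name -> ordered list of fingerprints

    def write(self, name, data):
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            # Hashing happens in the write path, before anything is stored.
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
            recipe.append(digest)
        self.recipes[name] = recipe

    def read(self, name):
        # Rebuild the backup by following its recipe of fingerprints.
        return b"".join(self.chunks[d] for d in self.recipes[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = InlineDedupStore()
store.write("monday", b"AAAABBBBAAAACCCC")
store.write("tuesday", b"AAAABBBBDDDDCCCC")
print(store.stored_bytes())  # 32 bytes were written, 16 are stored
```

In this sketch no full copy of either backup ever lands on disk, so the space used never exceeds the deduplicated size; the price is the fingerprinting done while the backup is running.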

Post-process deduplication is faster but requires much more space. It involves writing the backup to disk in full and then deduplicating the data afterwards. This places less strain on the system during the backup itself and does not affect backup windows. It also ensures that there is at least one full copy of the last backup on disk, which can aid data restoration. However, significantly more space must be available on the disk or virtual tape library for the process to work. Because it has minimal impact on what is already in place, post-process deduplication lends itself well to integration into existing backup environments, where there is likely to be a mix of tape and disk.
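The space cost of the post-process approach can be illustrated with a small Python sketch. It is an illustration under stated assumptions rather than how any given product works: chunks are fixed-size, fingerprints are SHA-256, and the function name `post_process_dedup`, the tiny `CHUNK_SIZE` and the sample data are hypothetical. The backups are assumed to have already been written in full to a landing area, and deduplication runs over them afterwards.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and unrealistically small, for illustration only


def post_process_dedup(staged_backups):
    """Deduplicate backups that have already been written to disk in full.

    staged_backups maps a backup name to its complete contents.
    Returns (landing_bytes, deduped_bytes, chunks, recipes).
    """
    # Space consumed by the full copies before any deduplication runs.
    landing_bytes = sum(len(data) for data in staged_backups.values())

    chunks = {}   # fingerprint -> chunk bytes (each stored once)
    recipes = {}  # backup name -> ordered list of fingerprints
    for name, data in staged_backups.items():
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            chunks.setdefault(digest, chunk)
            recipe.append(digest)
        recipes[name] = recipe

    deduped_bytes = sum(len(c) for c in chunks.values())
    return landing_bytes, deduped_bytes, chunks, recipes


staged = {
    "monday": b"AAAABBBBAAAACCCC",
    "tuesday": b"AAAABBBBDDDDCCCC",
}
landing_bytes, deduped_bytes, chunks, recipes = post_process_dedup(staged)

# Worst case in this sketch: the staged copies are released only after the
# whole pass has finished, so both sets of data exist side by side.
peak_bytes = landing_bytes + deduped_bytes
print(landing_bytes, deduped_bytes, peak_bytes)
```

The backup itself is a plain write with no hashing in its path, which is why the backup window is unaffected, but the landing area has to hold the full, undeduplicated data until the later pass completes.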

Whichever form of deduplication an organisation uses, it is not the be-all and end-all of an efficient backup system. As unstructured data continues to proliferate, organisations must also ensure that their systems can index, search and retrieve backed-up data correctly and efficiently. Otherwise, deduplication simply swaps 1,000,000 pieces of random data for 100,000; while the latter is better, it is still far from ideal.

 

This was last published in October 2008
