User reports dedupe ratios of up to 450:1 at Storage Expo

Associated Newspapers and CB Richard Ellis make significant deduplication gains using Data Domain and FalconStor data deduplication products

Users at Storage Expo in London this week reported achieving huge data reduction ratios using data deduplication, with international property advisor CB Richard Ellis achieving up to 450:1 while another user, Associated Newspapers, has clocked up to 60:1.

Both users also reported much improved backup and restore times and recovery points, as well as being able to free up storage arrays for online usage. The gains were made even though neither user shopped around in the data deduplication market.

"With our Oracle data we're getting lower ratios than you'd expect. But we're now able to keep five nights of backups instead of two."
Steve Bruck, infrastructure architect, Associated Newspapers

Associated Newspapers is the London-based publisher of the Daily Mail and a number of regional print publications as well as online news sources. As a 24-hour operation, the organisation had suffered immense problems with backup windows because the schedules of newspaper production run throughout the day, with different editions and publications overlapping each other.

As part of a virtualisation project that saw it reduce the number of its physical servers from more than 200 to 13, hosting 228 VMware guests, Associated Newspapers also established two Data Domain DD580 appliances as backup targets, one replicating to the other at a remote site. Initially using the Data Domain appliances as targets for snapshots of its virtual machines, the company has achieved data deduplication ratios of up to 60:1, with 25:1 being the average.

"Virtualisation gave us the chance to cut out our pain with backups," said Steve Bruck, infrastructure architect with Associated Newspapers. "We've avoided having large numbers of backup agents and have taken snapshots of the VMware servers instead. They are storage-hungry and not conducive to being stored on tape. Using our EMC NAS/SAN arrays would not have been cost-efficient."

The organisation has since gone on to perform deduplication on its Oracle databases and has achieved deduplication ratios of 5:1 to 15:1. "Data deduplication works best when data is relatively static," Bruck said. "With our Oracle data, there are lots of changes on a daily basis, so we're getting lower ratios than you'd expect. But we're now able to keep five nights of backups on disk instead of two."

International property advisor CB Richard Ellis reported gaining data deduplication ratios of 450:1 through FalconStor software on data types that include Microsoft Word, PowerPoint, Excel and other Office applications.

The firm has 33,700 employees at 400 offices in 57 countries with two main data centres in London and Madrid. While such a data deduplication ratio is unusually high, it illustrates that businesses can achieve impressive results with the technology if circumstances are favourable.

According to Clive Longbottom, service director with analyst group Quocirca, "This is a much larger than usual deduplication ratio than you usually hear about, but it goes to show that if you are coming from chaos and are deduplicating large amounts of data, you can achieve massive data reduction. Deduplication works by eliminating redundancy, so if you have a lot of instances of data of similar type, the deduplication engine can eliminate lots of it over time."
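Longbottom's point about eliminating redundancy can be illustrated with a minimal sketch. It assumes fixed-size chunks identified by SHA-256 hashes, which is a common textbook approach and not necessarily what Data Domain or FalconStor products do; the function names, chunk size and sample data are all hypothetical.

```python
import hashlib


def sample_data(label, size):
    """Build deterministic, non-repeating sample bytes (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{label}-{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def dedupe_ratio(backups, chunk_size=4096):
    """Return logical bytes divided by unique stored bytes.

    Assumes fixed-size chunking; each chunk is stored only the first
    time its hash is seen.
    """
    seen = set()
    logical = 0
    stored = 0
    for data in backups:
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            logical += len(chunk)
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:
                seen.add(digest)
                stored += len(chunk)
    return logical / stored if stored else 0.0


# Ten nightly backups of data that never changes
static_nights = [sample_data("static", 40960)] * 10
# Ten nightly backups where every byte differs from night to night
changing_nights = [sample_data(f"night{n}", 40960) for n in range(10)]

static_ratio = dedupe_ratio(static_nights)
changing_ratio = dedupe_ratio(changing_nights)
print(f"static data:   {static_ratio:.0f}:1")
print(f"changing data: {changing_ratio:.0f}:1")
```

In this toy model the unchanged data deduplicates in proportion to the number of copies kept, while data that is entirely rewritten each night gains nothing, which is consistent with Bruck's remark that deduplication works best when data is relatively static.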

Alex Gomes, IT manager with CB Richard Ellis, is impressed with the effects on restore times. "Before going to FalconStor, if we had to restore data from two months ago we would have to get the tape from a third party," he said. "By the time we had got the data back, the user would have beaten us to it by getting the person to send them the email again. Now we get data back in a few clicks. We can store more in less space, using less power and for less cost."

Both users achieved their impressive results despite taking the first data deduplication product they were offered. "We didn't trial anything else," said Bruck. "We found something and it worked with the ability to handle the quantity of data at a speed that was more than adequate."

Gomes said, "We couldn't see that anything else would do any better."

Experts usually advise users to trial a number of data deduplication products because vendors use different algorithms and other approaches to deduplication which can have varying outcomes when fed with different data types.
