NetApp adds storage deduplication to NearStore

According to a white paper quietly released by NetApp, the company has deduplication in beta tests -- but it won't yet work with most of its products or with snapshots.

According to a white paper released this week (but dated 26 February), Network Appliance (NetApp) has deduplication, but it is still limited in the products it supports and the size of the data stores it can deduplicate, and it has the potential for high overhead.

The company has ported its single-instancing algorithm, which it calls Advanced Single Instance Storage (A-SIS), from its SnapLock content-addressed storage (CAS) product into FlexVols to deduplicate data at the block level.

According to the white paper, "A-SIS only stores unique data blocks in the flexible volume and creates a small amount of additional metadata in the process. Each block of data has a digital signature, which is compared to all other signatures in the flexible volume. If an exact byte-for-byte block match exists on the flexible volume, the duplicate block is discarded and its disk space is reclaimed."
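The mechanism described in that passage can be illustrated with a minimal sketch. This is not NetApp's implementation: the 4 KB block size, the use of SHA-256 as the "digital signature" and the function names are all assumptions made for illustration, and the sketch works in a single pass rather than as a background process on a volume.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the article does not state one


def deduplicate(data: bytes, block_size: int = BLOCK_SIZE):
    """Keep one physical copy of each unique fixed-size block.

    Returns (unique_blocks, block_map), where block_map[i] is the index in
    unique_blocks of the block that backs logical block i.
    """
    unique_blocks = []  # blocks actually stored
    by_signature = {}   # signature -> indices of stored blocks with that signature
    block_map = []      # logical block number -> stored block index

    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # SHA-256 is a stand-in signature (assumption, not from the white paper).
        signature = hashlib.sha256(block).digest()

        match = None
        for idx in by_signature.get(signature, []):
            # A matching signature is only a candidate; confirm byte for byte.
            if unique_blocks[idx] == block:
                match = idx
                break

        if match is None:
            # First time this content is seen: store it.
            unique_blocks.append(block)
            match = len(unique_blocks) - 1
            by_signature.setdefault(signature, []).append(match)

        # Duplicates are not stored again; they only get a map entry.
        block_map.append(match)

    return unique_blocks, block_map


def restore(unique_blocks, block_map) -> bytes:
    """Rebuild the original data from the stored blocks and the block map."""
    return b"".join(unique_blocks[i] for i in block_map)
```

Here the block map plays the role of the "small amount of additional metadata" the paper mentions: duplicate blocks cost a map entry instead of a full block.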

The white paper also claims that the post-process deduplication has a 1% write performance hit. The background process, which is activated through a command line interface, can also be scheduled or run manually. A-SIS operates on the active file system (AFS) of a flexible volume.

The product is currently in beta tests and has not yet been released to the public. The white paper contains instructions for deploying A-SIS with NearStore, but doing so requires two licenses, called "nearstore_asis2" and "nearstore_option," to be activated on the filer.

A-SIS won't work with snapshots or LUNs and is limited in scale

There are also a few catches at this phase of the product, as detailed by the white paper: any block referenced by a snapshot copy cannot be deduplicated; A-SIS will only work on data sent via CIFS or NFS; it will not work on LUNs; and it is so far compatible only with the NearStore R200, FAS3020c and FAS3050c. A-SIS also cannot deduplicate across FlexVols, which currently have a size limit of 4 terabytes (TB) on the R200, 2 TB on the FAS3020c and 1 TB on the FAS3050c.
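For readers weighing a deployment, these restrictions can be restated as a simple checklist. The sketch below is only a summary of the limits as reported above; the `Volume` class and the `asis_blockers` function are hypothetical names and do not correspond to any NetApp tool. The snapshot restriction applies per block rather than per volume, so it is not modeled here.

```python
from dataclasses import dataclass

# FlexVol size limits in TB per platform, as reported in the article.
MAX_FLEXVOL_TB = {"R200": 4, "FAS3020c": 2, "FAS3050c": 1}
SUPPORTED_PROTOCOLS = {"CIFS", "NFS"}


@dataclass
class Volume:
    """Hypothetical description of a volume, for illustration only."""
    platform: str
    size_tb: float
    protocol: str
    is_lun: bool = False


def asis_blockers(vol: Volume) -> list:
    """Return the reasons, per the article, why A-SIS could not be used."""
    reasons = []
    limit = MAX_FLEXVOL_TB.get(vol.platform)
    if limit is None:
        reasons.append(f"platform {vol.platform} is not supported")
    elif vol.size_tb > limit:
        reasons.append(f"volume exceeds the {limit} TB limit on {vol.platform}")
    if vol.protocol not in SUPPORTED_PROTOCOLS:
        reasons.append(f"data arrives via {vol.protocol}, not CIFS or NFS")
    if vol.is_lun:
        reasons.append("LUNs are not supported")
    return reasons
```

An empty list means none of the volume-level restrictions reported in the article apply.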

The white paper also warns, "The total storage used by A-SIS is …1% to 3% of the actual stored data due to fingerprints in the fingerprint file and change log file(s). So for 1 TB of data there would be 10 GB to 30 GB of overhead." That's without snapshots -- if snapshots are turned on for the flexible volume, the paper states, "the overhead becomes additive each time A-SIS is run and is therefore substantial."
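The paper's overhead figure is straightforward arithmetic, sketched below for other volume sizes. The function name is hypothetical, and decimal units (1 TB = 1,000 GB) are assumed because that is what the paper's own 10 GB to 30 GB example implies. The additive overhead with snapshots is not modeled, since the paper gives no figure for it.

```python
def asis_overhead_gb(stored_tb, low_pct=1, high_pct=3):
    """Return (min_gb, max_gb) of A-SIS metadata overhead for stored_tb of data.

    Assumes 1 TB = 1000 GB, consistent with the white paper's example.
    """
    stored_gb = stored_tb * 1000
    return stored_gb * low_pct / 100, stored_gb * high_pct / 100


if __name__ == "__main__":
    # Volume sizes matching the per-platform FlexVol limits cited in the article.
    for tb in (1, 2, 4):
        low, high = asis_overhead_gb(tb)
        print(f"{tb} TB stored: {low:.0f} GB to {high:.0f} GB of overhead")
```

At the 4 TB FlexVol limit cited for the R200, the same percentages work out to 40 GB to 120 GB.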

Finally, under best practices, the white paper suggests that users "run A-SIS infrequently … do not run eight A-SIS processes concurrently if possible because there will be a negative performance impact on other applications."

It continues, "given the above two items, the best bet is to disable any A-SIS schedules on the flexible volume and run A-SIS manually, [and] turn off scheduled Snapshot copies or keep Snapshot copies to a minimum ... if Snapshot copies are required, run A-SIS before creating the Snapshot copy as this will minimize the amount of data that gets locked in Snapshot copies."

"This pretty much looks like an SMB [small and midsized business] type play where the backup window finishes, and you have fairly large amounts of time to dedupe data already backed up," said Jerome Wendt, lead analyst and president with DCIG Inc. "You might free up space in the course of a day with it, but you still need all that space for your backups before the dedupe happens."

"Based on what I'm seeing in this white paper, to me this looks like a poor man's solution to single-instance storage," Wendt said. "I wouldn't view this as a robust way to manage it. It appears it will work, but it's really not suited for the enterprise. There are so many qualifiers here, and even NetApp recommends just doing it under certain circumstances and infrequently."

Read the entire white paper on A-SIS.
