Copan adds data deduplication to MAID

Copan claims that, with new support for FalconStor's data deduplication in its virtual tape library, it can hold up to 6 petabytes in 10 cubic feet of space.

Copan Systems is adding data deduplication and block-level access to its massive array of idle disks (MAID) platform.

Copan signed up with OEM partner FalconStor Software Inc. to add deduplication to its virtual tape library (VTL) in the newest revision of its Revolution product, the R300.

Density for archival storage has always been Copan's focus. Its MAID architecture spins drives up and down according to frequency of access, which lets it pack more disks into less space. The 300 Series will now support 750 GB SATA drives for up to 672 terabytes (TB) of physical storage in one cabinet, according to John Mellon, senior vice president of worldwide marketing and business development. Copan is still working on qualifying 1 TB SATA drives, he added.

Copan is no longer the only MAID in town, though. NEC Corp. of America, Nexsan Technologies and Fujitsu also ship MAID systems, and Hitachi Data Systems (HDS) last month revealed a unique version of the architecture. That makes Copan's features more important than when it first launched its platform in 2004.

Serving as a VTL has been Copan's primary function, and it is now supporting FalconStor's Single Instance Repository (SIR) data deduplication software under the VTL interface it also OEMs from FalconStor. The company claims that SIR, along with MAID, will allow up to 6 petabytes (PB) of logical storage to be packed into one chassis, which takes up about 10 cubic feet.
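Copan does not state a deduplication ratio, but the quoted figures imply one. The following is a rough back-of-the-envelope sketch, not a vendor calculation. It assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000 TB), a fully populated cabinet of 750 GB drives, and the entire 672 TB of physical capacity given over to the deduplicating VTL. The function names are illustrative only.

```python
# Figures quoted in the article. Decimal units are an assumption:
# 1 TB = 1000 GB and 1 PB = 1000 TB.
DRIVE_GB = 750            # capacity of one SATA drive
PHYSICAL_TB = 672         # maximum physical storage in one cabinet
LOGICAL_PB = 6            # claimed logical capacity with SIR plus MAID
CHASSIS_CUBIC_FEET = 10   # approximate chassis volume


def implied_drive_count(physical_tb, drive_gb):
    """Number of drives needed to reach the physical capacity."""
    return physical_tb * 1000 / drive_gb


def implied_dedup_ratio(logical_pb, physical_tb):
    """Logical-to-physical ratio implied by the capacity claim."""
    return logical_pb * 1000 / physical_tb


def logical_tb_per_cubic_foot(logical_pb, cubic_feet):
    """Logical capacity per cubic foot of chassis volume."""
    return logical_pb * 1000 / cubic_feet


if __name__ == "__main__":
    drives = implied_drive_count(PHYSICAL_TB, DRIVE_GB)
    ratio = implied_dedup_ratio(LOGICAL_PB, PHYSICAL_TB)
    density = logical_tb_per_cubic_foot(LOGICAL_PB, CHASSIS_CUBIC_FEET)
    print(f"Implied drive count: {drives:.0f}")
    print(f"Implied deduplication ratio: {ratio:.1f}:1")
    print(f"Logical density: {density:.0f} TB per cubic foot")
```

Under those assumptions the claim works out to 896 drives and a deduplication ratio of roughly 9:1. Actual ratios depend heavily on the backup data and retention policy, so the 6 PB figure is best read as an upper bound.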

FalconStor's SIR has been available for more than a year, but none of its major OEM VTL partners offers the feature. Pillar Data Systems Inc. became the first to sign on last month, announcing it will support SIR along with other vendors' data deduplication software later this year. Copan will be the first to ship SIR, beginning Oct. 15. FalconStor OEM VTL partners EMC Corp., IBM and Sun Microsystems Inc. have yet to qualify SIR with their products.

Copan said the SIR function has been put through performance testing and claims it will not cause a performance hit on the VTL. (The addition of new quad-core Intel controllers in the 300 will probably also help.) Copan also claims several beta customers have been testing the system successfully for the last couple of months, although none was available to talk to the press.

However, the two companies had to go through extensive development to get the VTL software working with the MAID disks, and further work was needed to ensure that data deduplication metadata is stored in "always on" regions of the array.

Just how much time drives actually spend idle in real-world VTL environments is a question that's been raised about MAID systems in the industry recently.

But one Copan user, Matt Johnson, senior software system specialist for the University of Texas Medical Branch (UTMB), said his Copan system averages only about 25% of the disk drives active at a time. Johnson's Copan array holds file system backups from Unix and Windows servers.

Johnson also said the long qualification process doesn't make him wary about trying out data deduplication, because his primary concern is being able to pack as much data into the box as possible. "The downside [with data deduplication] is that it will probably take a little more time to reconstruct [data]. But reconstruction time will be an issue with whatever vendor you choose," he said.

Aside from the VTL, the R300 supports network attached storage (NAS), archiving and new block-level access. Copan calls these "personalities," and multiple personalities can reside on the same frame, divided by disk shelf. To change a shelf's personality, an administrator must first migrate the data off that shelf. Data deduplication is only supported as part of the VTL personality.

Despite the block-level access feature, Copan is not recommending that the box be used as a storage area network (SAN). The feature is intended to let software applications that require block-level access communicate with the R300.

The archive is not being positioned as a tool for meeting regulatory compliance but as a repository for files that will be offloaded to secondary storage. It uses a proprietary file system that Copan claims is immutable but that isn't fully compliant with write once, read many (WORM) requirements. WORM is on the roadmap for 2008.

One analyst cautions users to keep in mind the effect of ultra-dense systems on their data centers. "Not a lot of people think about this, but if you're packing so many disks into one box, with more and more bits on them, the weight of these chassis can sometimes overwhelm raised-floor tiles," said John Webster, principal IT advisor with Illuminata Inc.

Starting list price for one shelf and chassis is around $125,000, plus software license fees.
