The vendor will make its appliance and software generally available next Monday. The data reduction package includes the Ocarina Optimizer, the Ocarina Reader and the Ocarina Manager. The Optimizer performs the compression functions, the Reader decompresses files for viewing and the Manager is the user interface.
The Ocarina Reader is a software agent that can be installed on workstations or servers to decompress files the Optimizer has compressed so they can be viewed later. The Reader is needed to restore optimised files, and any Reader in any location can read any file in Ocarina format.
The Optimizer and the Reader can be used with CIFS and NFS file systems on both monolithic and clustered NAS systems, and support about 100 proprietary file formats, including Microsoft Office, video and image files.
Tackling pre-compressed and multimedia files
Once the appliance has copied a file out of the primary data store, it pulls the file apart to extract its component objects. For example, a PowerPoint file could be broken down into text and .jpg or .png graphics. From there, each storage object is compressed, and redundant objects are consolidated across files. Finally, the Ocarina box returns the optimised file to the primary storage device. The data is protected against corruption by a set of checksums inserted into the header of each file.
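The decompose, compress and consolidate steps described above can be sketched roughly as follows. This is only an illustration, not Ocarina's implementation: it assumes a ZIP-based container (such as a .pptx file), uses zlib as a stand-in for the proprietary algorithms, and uses a SHA-256 digest both to spot redundant objects and as the integrity checksum. The names `optimise`, `restore_member` and `make_deck` are hypothetical.

```python
import hashlib
import io
import zipfile
import zlib


def optimise(files):
    """Split ZIP-based files into objects, keep one compressed copy of each.

    files maps a file name to its raw bytes. Returns (store, manifests):
    store maps a SHA-256 digest to a compressed object, and manifests maps
    each file name to its list of (member name, digest) pairs.
    """
    store = {}
    manifests = {}
    for name, blob in files.items():
        members = []
        with zipfile.ZipFile(io.BytesIO(blob)) as archive:
            for info in archive.infolist():
                data = archive.read(info)  # extract one component object
                digest = hashlib.sha256(data).hexdigest()
                if digest not in store:  # redundant objects are stored once
                    store[digest] = zlib.compress(data, 9)
                members.append((info.filename, digest))
        manifests[name] = members
    return store, manifests


def restore_member(store, manifests, file_name, member_name):
    """Decompress one object and verify it against its recorded digest."""
    for name, digest in manifests[file_name]:
        if name == member_name:
            data = zlib.decompress(store[digest])
            if hashlib.sha256(data).hexdigest() != digest:
                raise ValueError("checksum mismatch: object is corrupt")
            return data
    raise KeyError(member_name)


def make_deck(members):
    """Build a small ZIP container in memory (a stand-in for a .pptx file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for member_name, data in members.items():
            archive.writestr(member_name, data)
    return buf.getvalue()
```

In this sketch, two decks that embed the same logo image contribute only one copy of that image to the store, and every restored object is checked against its digest before it is returned.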
Ocarina claims a 10:1 data reduction ratio, an improvement on standard compression technologies, which usually achieve a ratio of about 2:1. It achieves that ratio by pulling apart standard file formats, many of which are natively compressed, and applying proprietary algorithms to further compress the objects within those files. These algorithms also make it possible to create a three-dimensional cube of numeric values to represent a photo or video image.
"So if you take some photos on a day at the beach, our algorithms will be able to look for similar boundaries, such as light levels, that it's seen before," said Carter George, Ocarina vice president of products. "It's like computer vision."
European photo-sharing site expects to save 200 TB
So far, Ocarina has one publicly named user in the Web 2.0 multimedia market and claims more big names in that space are testing the product. Graham Hobson, chief technology officer of Photobox, a photo sharing and printing site, said his company has been testing Ocarina products since their pre-alpha stage two years ago. He intends to put them into production next week.
Hobson said that for the first six years after the company was founded in 2000, the cost of data storage fell each year at a rate that kept up with Photobox's capacity growth. But about two years ago, that trend stopped. "We're dealing with bigger files now," Hobson said. The monthly cost of rack space in the company's co-location data centres in Europe has increased around fourfold because of increasing energy prices.
Photobox currently has about 800 TB of capacity on Isilon clustered NAS systems, with about 600 TB used. "At this rate, without data reduction, we'll exhaust that capacity in the next three months," according to Hobson. With Ocarina, the company is hoping to make that capacity last through September, a saving of about 200 TB.
Primary data reduction still a bleeding-edge market
Ocarina is not the first to market with a data reduction product for primary storage. Storwize also compresses data on primary NAS systems. However, Storwize sits in the data path and passes through files it can't optimise. The Storwize appliance is also needed to recover data, and a separate Storwize appliance is needed to receive and restore data at secondary sites.
Ocarina is using this difference between the products to market itself as less risky than Storwize's "bump in the wire" approach, something Hobson said gave him more confidence in the Ocarina product.
But Ocarina's data reduction isn't for everyone. Besides the worries about data corruption that makers of secondary storage data deduplication products have also faced, Ocarina's "computer vision" means it is literally reading files, a big risk in a security-conscious environment.