
The object storage system elephant in the room: It can't go on tape

Object storage systems are touted as ideal for large data volumes and archiving, but there's no way to get object storage onto tape because of its metadata-based approach.

You can't do it -- get object storage taped, I mean. There is no way to get the contents of an object storage system onto tape. Instead, it has to stay on spinning disk forever. Does this matter?

Object storage system vendors such as Amplidata, NetApp (with its Bycast acquisition), Caringo, EMC (Atmos, Centera), Cleversafe, DataDirect Networks, Hitachi Data Systems (Hitachi Content Platform) and Scality position object storage as suitable for long-term storage of billions of objects. These objects are stored with metadata and uniquely identified through some kind of hashing, and the systems that hold them have self-healing capabilities.
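
For readers who have not looked under the hood, here is a minimal sketch of the general idea of hash-identified objects with attached metadata and an integrity check. It is an illustration only, not any vendor's actual design: the `TinyObjectStore` class, its method names and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def object_id(data: bytes) -> str:
    """Derive an identifier from the object's content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


class TinyObjectStore:
    """Hypothetical in-memory store: objects are keyed by a hash of their content."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        # Store the payload together with its metadata under the content hash.
        oid = object_id(data)
        self._objects[oid] = {"data": data, "metadata": dict(metadata)}
        return oid

    def get(self, oid: str) -> dict:
        return self._objects[oid]

    def verify(self, oid: str) -> bool:
        # Integrity check: re-hash the stored payload and compare with its ID.
        # A real system would repair a failed object from a replica; this
        # sketch only detects the damage.
        return object_id(self._objects[oid]["data"]) == oid
```

A store like this can tell when an object has rotted, because the payload no longer matches its own identifier. The repair step is where the real systems differ from one another.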

Here's some of what the vendors say about object storage. A Caringo statement, for example, concerning its integration with Symantec's Enterprise Vault reads, "Caringo enables Symantec customers to save up to 40 percent in total cost of ownership over NAS or tape." Forty percent lower total cost of ownership than tape -- that's an amazing claim.

And here's a statement from an HDS seminar flier: "Join us for a live webcast and hear how object storage reduces or eliminates the need for traditional tape-based backups." The flier says tape backup systems are unreliable and error-prone and that the HDS object store is "designed to need no backup," because of its continuous data integrity checking and proactive data repair. It claims lower OPEX, lower CAPEX and better data protection than tape. Really? For billions of objects?

The object storage system vendors make a big deal about their products being suitable as an active archive because everything is online. That's true, but objects age like every other digital container. And as archives approach and pass the billion-object level and head towards the trillions, millions (or tens of millions or even hundreds of millions) of those objects won't be accessed very often, if ever. Everyone intuitively knows that.

And everyone knows that using energy to keep disks spinning to store objects that no one is going to access for years is economically and environmentally stupid. The cost per gigabyte of disk storage is higher than the cost per gigabyte of tape storage. Add in as many management costs as you like, but remember that tape has media checking and tape libraries can self-heal; the software has advanced, and tape cartridge capacities are up to 5 TB raw and heading to 10 TB and beyond.

As the amount of data to be stored grows and grows, tape will become the lowest-cost option. For high-volume data archive capacities, disk economics suck, and it’s no use pretending data deduplication and thin provisioning can change that.

All of which means that object storage system vendors have their collective heads in the clouds; they’re envisioning a “big data,” mass-use nirvana for products that have so far been limited to niches. It's a collective fantasy that, with their object storage hammer, every big data storage problem is a nail they can hit. It's not. Get real; cold objects shouldn't be stored on disk. It’s as simple as that.

What is needed is a way to drain off cold, inactive objects from disk and stuff them into a tape archive. Isn't it obvious? Yes, tape is slower than disk, but tape is greener than disk and lower cost than disk and at the trillion-object level, that will matter a great deal.
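
The selection half of that idea is the easy part, and a rough sketch shows how little it takes. Everything here is hypothetical: the `ObjectRecord` fields, the `select_cold_objects` function and the one-year idle threshold are assumptions for illustration, and the hard part (actually reading the objects out of a distributed store and writing them to tape) is not shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ObjectRecord:
    """Hypothetical catalogue entry for one stored object."""
    object_id: str
    size_bytes: int
    last_accessed: datetime


def select_cold_objects(records, now, max_idle=timedelta(days=365)):
    """Return the records not accessed within max_idle (assumed policy: one year)."""
    cutoff = now - max_idle
    return [r for r in records if r.last_accessed < cutoff]


def bytes_to_drain(records):
    """Total capacity that would move off disk if these records went to tape."""
    return sum(r.size_bytes for r in records)
```

Run against an object catalogue, a policy like this would produce the list of candidates for tape and the disk capacity they would free up. What no one offers today is the mover that acts on that list.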

It's no use waiting for backup software vendors to come up with a solution here. They have as much understanding of object storage as I do of quantum thermodynamics. Let's suppose we want to move a hundred objects to tape from a Scality Ring system. The first thing the backup app would need to know is where they are. And that’s not an easy question to answer. Ring is a distributed set of nodes with objects and metadata sliced and diced across it. And that's just Scality. There is no single open standard object storage system; they are all different. Amplidata objects are stored differently from Caringo objects, which are stored differently from … . You get the idea.

The most obvious source for an object-storage-to-tape transfer technology is an object storage system vendor. But those vendors have no motivation to develop such a thing. Tape archives are their market enemy, and none of them have yet sold systems at the multipetabyte capacity level, where media costs become huge, or the exabyte level, where media costs are terrifying and disk failures a daily occurrence.

Inside the really big cloud storage vendors, such as Amazon, Azure, Google and Joyent, the strategists extrapolating storage volumes and thinking through capacities and costs and future media purchases will be aware of these things. Somebody somewhere will have run storage capacity and management cost projections with object storage and with tape, seen the result and shaken their head in disbelief. “Thank heavens that's not a problem for today,” they will say, “because, down the road, object storage is not going to be cheap and tape will look like a very good value indeed. Now if only we could trickle cold and nearly dead objects off to tape, we could have lower costs than our competitors. How could we do that?”

There's an unacknowledged elephant in the object storage room. The object storage system suppliers might as well get acquainted with it and say, "Hi." It could be a cost saviour in years to come.

Chris Mellor is storage editor at The Register.
