XAM: a guide to standardising archives

How much money has your organisation invested in storage systems and processes over the last decade? Chances are you have spent many millions of pounds storing information that needs to be kept safe for legal, financial and regulatory reasons.

But how long will you be able to access that information? If your storage systems became obsolete, would the information be lost? Will the technology exist to access your archives in 10, 50 or even 100 years' time?

The idea that archives could be lost through technical obsolescence is a phenomenon that the Storage Networking Industry Association (SNIA) refers to as the "digital dark age".

It is certainly a real enough concern for many IT managers. A survey conducted by SNIA earlier this year found that 60% of IT leaders were concerned that their archived data would not be accessible in 50 years' time. That might not seem like a big deal - but 68% of the same IT managers said their company archived information that would need to be retained for as long as 100 years.

Long-term archiving is increasingly hitting the IT manager's agenda, says Jay Mastag, vice-president of development and general manager with storage supplier EMC. The company has seen demand for long-term archiving systems grow by 60% in the last year, but many organisations buying systems seriously doubt their longevity. "We are seeing this become a real concern for managers. What do they buy, how confident are they that suppliers and technologies can last the distance?"

Archives tend to be stored on a particular kind of storage system known as content-addressable storage. This system is ideally suited to archiving because it is write-once, read many times and stores information in non-modifiable, original form. It is often used to store digital images, medical records, signed contracts and video files - exactly the sort of things organisations are being compelled to archive by regulators, says Mastag.
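To make the idea concrete, here is a minimal Python sketch of content addressing. It is an illustration only: the class name, the in-memory dictionary and the choice of SHA-256 are assumptions made for this example, and commercial systems use their own addressing schemes and APIs.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory store (hypothetical): an object's address is the hash of its bytes."""

    def __init__(self):
        self._objects = {}

    def write(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Write-once: if identical content is already stored, it is left untouched.
        self._objects.setdefault(address, bytes(data))
        return address

    def read(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hashing on read detects any change to the stored bytes.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content no longer matches its address")
        return data


store = ContentAddressableStore()
address = store.write(b"signed contract, version 1")
same_address = store.write(b"signed contract, version 1")
other_address = store.write(b"signed contract, version 2")
```

Because the address is derived from the content, writing the same document twice yields the same address, while any change to the document produces a different one. That is why such a store suits records that must be kept in their original form.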

However, content-addressable storage comes with a major downside. If you buy a content-addressable storage system such as EMC Centera, you will need an archiving application that is written specifically for the Centera application programming interfaces (APIs). The application will not write to an HDS system, nor will an HDS-qualified application write to an EMC system. Virtually all content-addressable storage is proprietary, and applications must be rewritten to write to each specific back-end system.

The result for users is often supplier lock-in. If your archive is on an HDS system, there is very little opportunity to move to an EMC system later on - unless you want to take on the substantial cost and risk involved. This then raises the spectre of supplier viability - what happens if your chosen archive system suddenly disappears?

"If you store information with Application A, and then 10 years later no longer have that application, how can you be sure you can retrieve your archive? You cannot," says Frank Bunn, a board director with SNIA. "People, therefore, have to spend a lot of time investing in archiving tools they believe have longevity, and that is not always possible. It is a real risk, and it is not ideal."

Lack of interoperability has been a major issue in the industry for many years, says Bunn, and developers and IT professionals have long been pushing for improvements. Now, the storage industry says it has finally found a solution: XAM.

XAM, or eXtensible Access Method, is a new, high-level, open standard that governs content-addressable storage. (In fact, XAM is so new that, depending on who you speak to, the standard is pronounced "zam", "exam" or "X.A.M".)

Simply put, what this means for users is that you will now be able to buy XAM storage systems from the likes of Hewlett-Packard and IBM, and migrate data between the two. Applications written for an HP XAM system should also work with an IBM XAM system. This, SNIA says, is great news for users: improved archive security and longevity, increased supplier flexibility and reduced integration and development costs.
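The principle of writing an application once against a shared interface can be sketched in a few lines of Python. Everything named here is hypothetical: `ArchiveBackend`, `VendorABackend` and `VendorBBackend` are invented for illustration and are not the XAM API or any supplier's product interface.

```python
from abc import ABC, abstractmethod


class ArchiveBackend(ABC):
    """Hypothetical common interface; NOT the real XAM API."""

    @abstractmethod
    def store(self, data: bytes) -> str: ...

    @abstractmethod
    def retrieve(self, object_id: str) -> bytes: ...


class VendorABackend(ArchiveBackend):
    """Stand-in for one supplier's system, which keeps items in a dictionary."""

    def __init__(self):
        self._items = {}

    def store(self, data: bytes) -> str:
        object_id = f"a-{len(self._items)}"
        self._items[object_id] = data
        return object_id

    def retrieve(self, object_id: str) -> bytes:
        return self._items[object_id]


class VendorBBackend(ArchiveBackend):
    """Stand-in for a second supplier's system, which keeps items in a list."""

    def __init__(self):
        self._items = []

    def store(self, data: bytes) -> str:
        self._items.append(data)
        return str(len(self._items) - 1)

    def retrieve(self, object_id: str) -> bytes:
        return self._items[int(object_id)]


def archive_document(backend: ArchiveBackend, data: bytes) -> bytes:
    """Application code: it depends only on the shared interface."""
    object_id = backend.store(data)
    return backend.retrieve(object_id)


results = [archive_document(backend, b"medical record")
           for backend in (VendorABackend(), VendorBBackend())]
```

The `archive_document` function never needs to know which supplier's system sits behind the interface, which is the kind of portability the standard is meant to deliver.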

The standard is good news for application developers, too, adds Carl Greiner, an infrastructure analyst with Ovum. "It makes life a lot easier for software developers because there will be very little extra work in qualifying an application for HP once it has been qualified by Sun."

XAM was originally proposed as long ago as 2003, when SNIA was in the process of creating the SMI-S specification for storage hardware interfaces. That standard did provide a degree of interoperability between storage products but not at the same level as XAM, explains SNIA's Bunn. "The difference between SMI-S and XAM is that XAM is operating at a higher level," he says. "Whereas SMI-S looks at hardware interfaces, XAM looks at metadata, and the characteristics of information, so the hardware interface does not come into it."

The XAM protocol is highly complex but the metadata allows the complexity to be almost completely hidden from IT managers and application developers, says Bunn.

XAM looks at archived data differently from existing storage APIs. The new standard incorporates a metadata framework that allows each item in an archive to be associated with a unique, rich metadata tag. This metadata does not change even if the data is moved to another system - from IBM to HP, for example. This means the information can be identified and discovered by a future XAM system, no matter where it resides at that point in time. XAM, therefore, provides security for IT managers who are aware that archives are likely to be moved or migrated during their lifecycle.
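A rough Python sketch shows why an identifier and metadata that travel with the data make migration safer. The `ArchiveSystem` class, its methods and the metadata keys are all assumptions made for this example, and the random UUID simply stands in for whatever unique identifier a real system would assign.

```python
import uuid


class ArchiveSystem:
    """Hypothetical archive: each item is stored with an ID and its metadata."""

    def __init__(self, name: str):
        self.name = name
        self._items = {}

    def store(self, content: bytes, metadata: dict) -> str:
        item_id = str(uuid.uuid4())
        self._items[item_id] = (content, dict(metadata))
        return item_id

    def item_ids(self) -> list:
        return list(self._items)

    def export_item(self, item_id: str):
        content, metadata = self._items[item_id]
        return item_id, content, dict(metadata)

    def import_item(self, item_id: str, content: bytes, metadata: dict) -> None:
        # The ID and metadata are taken over unchanged from the source system.
        self._items[item_id] = (content, dict(metadata))

    def find(self, **criteria) -> list:
        return [item_id for item_id, (_, metadata) in self._items.items()
                if all(metadata.get(k) == v for k, v in criteria.items())]


def migrate(source: ArchiveSystem, target: ArchiveSystem) -> None:
    for item_id in source.item_ids():
        target.import_item(*source.export_item(item_id))


old_system = ArchiveSystem("old supplier")
new_system = ArchiveSystem("new supplier")
contract_id = old_system.store(b"contract bytes", {"type": "contract", "year": 2008})
migrate(old_system, new_system)
found = new_system.find(type="contract")
```

After the migration, the contract can be found on the new system under the same identifier and with the same metadata it had on the old one.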

By using metadata, XAM can also express key characteristics of the information in an archive. For example, there is a XAM standard for describing different content forms, with defined metadata formats for e-mail that cover sender, recipient and subject.

This detailed metadata is designed to aid companies with future e-discovery and search functions. Using metadata, it should be possible for IT managers to interrogate archives in great detail. In addition, managers can use the metadata framework to create information lifecycle management policies, automatically deleting or retaining information in line with corporate needs, adds Mastag. "All of this is built into XAM, so the management costs of archives will be substantially lower," he says.
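As an illustration of metadata-driven search and retention, the following Python sketch filters e-mail records by field and applies a simple retention rule. The sender, recipient and subject fields follow the e-mail metadata described above. The `archived_on` and `retain_years` fields, the function names and the sample records are hypothetical and are not taken from the XAM specification.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class EmailRecord:
    sender: str
    recipient: str
    subject: str
    archived_on: date   # hypothetical field
    retain_years: int   # hypothetical field


def search(records, **criteria):
    """Return the records whose metadata matches every given field exactly."""
    return [r for r in records
            if all(getattr(r, k) == v for k, v in criteria.items())]


def is_expired(record, today):
    """True once the retention period has fully elapsed."""
    cutoff = (today.year - record.retain_years, today.month, today.day)
    archived = (record.archived_on.year, record.archived_on.month,
                record.archived_on.day)
    return archived <= cutoff


def apply_retention(records, today):
    """Split the records into those to keep and those due for deletion."""
    kept = [r for r in records if not is_expired(r, today)]
    deleted = [r for r in records if is_expired(r, today)]
    return kept, deleted


records = [
    EmailRecord("alice@example.com", "bob@example.com", "Q1 invoice", date(2001, 3, 1), 7),
    EmailRecord("alice@example.com", "carol@example.com", "Contract draft", date(2005, 6, 15), 7),
    EmailRecord("dave@example.com", "bob@example.com", "Lunch", date(2000, 1, 10), 1),
]
from_alice = search(records, sender="alice@example.com")
kept, deleted = apply_retention(records, today=date(2008, 10, 1))
```

Here the search returns both messages from the first sender, and the retention rule keeps only the 2005 message, because the other two have passed their retention periods.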

This complexity is the reason why XAM has taken five years to get to this point, says Greiner. "Everyone had their own ideas about what should be left in, and it has been watered down a little over time, but there is still a lot of rich functionality within XAM," he says.

Today, there is a software library available for XAM on the SNIA website, which can be downloaded by application developers. Version 1.0 of the XAM specification has been ratified and SNIA is hoping to accelerate development of the standard in 2009, says Bunn.

Certainly, XAM has strong industry backing. Suppliers that have committed to supporting XAM include heavy-hitters such as EMC, IBM, HP, Network Appliance and Sun Microsystems. "We currently have more than 45 suppliers on board with XAM, but I think many suppliers will wait until the standard is more established before jumping on board, so we are keen to help get systems up and running," says Bunn.

Currently there are no XAM products on the market, although several prototypes were demonstrated at the most recent Storage Networking World trade show. EMC released a XAM support software development kit (SDK) and device-side XAM protocol on August 28 and further announcements are expected in the coming months from fellow SNIA members IBM, HP and Sun. EMC says that XAM will be available in Centera before the end of 2008.

Where organisations are currently using non-XAM products, Bunn believes that suppliers will begin offering conversion tools for XAM from 2009, although he doubts there will be full backward compatibility. "I think the most likely solution is that you will be able to migrate storage, to read data on non-XAM and move it to XAM in future," Bunn says.

Why are suppliers so keen to promote a standard that could loosen the grip they have on customers? "Simple. XAM will enable a whole new wave of storage innovation," says EMC's Mastag. "What XAM will do is create a level playing field in storage. You can no longer keep a customer simply because they cannot move to another supplier. You have got to work harder, you have got to provide more," he says. "Who knows what sort of innovation the application developers will create once they are free to write to multiple systems?"

Although it is still early days, Greiner recommends IT departments put XAM on the storage shortlist, as he believes that it offers faster deployment, more agility and increased flexibility.

In fact, Greiner believes that XAM will be the de facto standard for storage applications and systems in as little as three years. "There is tremendous demand for something like XAM and I think it is all looking extremely good," he says. "Is it 100% today? Hell no, but it is a heck of a nice beginning."

What is XAM?

The storage industry is working towards a standard for interoperability called XAM, which is designed to make it easier for companies to store information in archives and migrate to other systems in the future without modifying the archive contents.

XAM provides a standard interface enabling archiving applications to write to content-addressable storage systems without custom APIs. This means enterprises can migrate archives from one supplier's hardware to another via XAM.

The key organisation behind the launch of XAM is the Storage Networking Industry Association (SNIA), which has the support of EMC, IBM, Hitachi Data Systems, Network Appliance, HP and Sun. SNIA has released a software developers' kit for XAM and has released version 1.0 of the API specification.

The first XAM products are already available - EMC Centera and HP's Integrated Archive Platform support XAM 1.0. It is expected that applications for XAM archives will be available from early 2009.
