Document management software stakes a claim for tiering duties

Storage tiering is a hot meme, and rightly so: it can lower costs and is an important way to use the speed of solid state disks to your advantage. But with three kinds of product on the market (hardware, storage software and document management software) all offering tiering features, how can users decide which is the best tiering option?

Price lists can sometimes be eloquent documents, and never more so than in the field of storage. When SearchStorage ANZ went online to research this article, we found two-terabyte SATA disks for $220, a 300GB Fibre Channel (FC) drive for $950 and 32GB solid state disks (SSDs) for $180.

SATA therefore comes in at 11 cents a gigabyte and FC at about $3.17 a gigabyte, while SSDs hit $5.62.
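The per-gigabyte figures come from dividing each quoted price by the drive's capacity. A minimal sketch of that arithmetic follows, assuming a two-terabyte disk is counted as 2,000GB; the dictionary and function names are illustrative only. Computed this way from the quoted $950 and 300GB, the FC figure comes out at roughly $3.17.

```python
# Prices (in dollars) and capacities (in GB) as quoted in the article.
# Assumption: 2TB is treated as 2,000GB rather than 2,048GB.
drives = {
    "SATA": (220.00, 2000),
    "FC": (950.00, 300),
    "SSD": (180.00, 32),
}


def cost_per_gb(price, capacity_gb):
    """Return the price of one gigabyte on a drive."""
    return price / capacity_gb


for name, (price, capacity_gb) in drives.items():
    print(f"{name}: ${cost_per_gb(price, capacity_gb):.2f} per GB")
```

On these numbers a gigabyte of SSD costs about 50 times as much as a gigabyte of SATA, which is the gap that tiering sets out to exploit.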

This wide variance in price neatly illustrates the need for operating different tiers of storage, as it simply makes no sense to spend $5.62 to store a gigabyte of data when it can be done for 11 cents. The flip-side, of course, is that SSDs are far faster than SATA drives. Organisations therefore understand that old data belongs on slow, cheap storage, while the data that applications use most deserves a home on storage media that can deliver it with slippery speed.

Storage aficionados may note, at this point, that such ideas are not new. Hierarchical Storage Management (HSM) and Information Lifecycle Management (ILM) both advocated very similar practices.

The reason for the idea’s revival is threefold, with the always-attractive prospect of controlling costs foremost.

A new and important consideration is users’ desire to use the blinding speed of SSDs to their advantage, as these new disks can greatly improve end-users’ experience of applications by speeding data to their eyeballs with pleasing haste.

Lastly, hardware and software vendors are now baking tiering into their products. The former are doing so in order to offer a mixture of disk types within a single chassis, so that users can serve data of different types from one box. The latter are interested in tiering as part of their overall mission to manage their customers’ data.

But a third contender is also having a go at tiering, as document management vendors apply their process-centric view of the computing world to the task and insist their closeness to the business – instead of the arcane world of storage arrays – means they should get the job.

Process engines trump storage hacks?

James Latham, Senior Vice President and Chief Marketing Officer for Open Text, says his company’s tools are designed to serve as a “process engine,” shuffling documents so that process workers can always access them as and when needed to perform tasks like assessing insurance claims.

In that role, Latham says his software can also “archive and compress old data to manage assets,” a task that sees data made more or less available depending on age or other policy criteria. This approach, he believes, results in better outcomes than storage-centric tiering, as classifying data according to its relevance to a business process is more powerful than doing so according to the cruder metrics a storage device values.

Latham is, however, happy to co-exist with storage vendors’ tiering efforts, even though Open Text software is not aware of the APIs storage vendors offer.

Keith Busson, Quantum’s Country Manager for Australia and New Zealand, sees some merit in using a document-centric approach, which he says can “ ... potentially deliver high-performance access to this data.”

“There is some dependency to the hardware,” he says, “but the application can track where the data is and can retrieve it in a timely manner when requested. You also have more freedom of choice regarding hardware components to deploy which can potentially lower the overall cost of the solution.”

Busson also sees some downside.

“Protection and proper archiving can be a challenge though. A document-centric approach will set protection at the software layer, but could potentially be at risk of hardware failures if the proper hardware precautions are not considered. You may also have to consider incorporating third-party technologies to do things like replication which eats into your desired cost savings. Furthermore, this approach may be limited on where and on what platforms it can be deployed. It probably is also limited to a specific operating system or network connection, which can limit the flexibility of the solution.”

Some data escapes the CMS

Paul McClure of CommVault also sees some downside in letting document management suites drive tiering.

“Not all data that needs to be tiered belongs in a document management system,” he points out, citing images or MP3 files as likely escapees from such applications. Document management systems are also incapable of addressing documents they do not manage, and do not concern themselves with other data management tasks like backup and maintenance of archives.

He therefore advocates storage management software as the most logical tool with which to tier, as such tools not only offer greater coverage but also “offer more efficient and secure means of data storage by leveraging technologies such as compression, deduplication / single instance storage and encryption.”

McClure concludes that “The best approach is to select an archiving vendor that provides integration with document management systems allowing for the data in the archive to be available (on demand), searchable and if relevant declared as a record into a document management system.”

Paul Lancaster, Director of Systems Engineering at Symantec, also has issues with document-management-driven tiering, which he says “may claim to do some tiering but ultimately this requires an organisation to make changes to business processes, which can be difficult to implement. However, if we look at the management side from an application owner, document management may be able to put in place prerequisites that are purely limited to the document management applications.”

“If document management is in charge of tiering, organisations will probably have to redesign the associated applications and there would also then be a need for associated changes in the methods of data protection.”

Lancaster instead advocates tiering at file level, because the interrogation of files it requires creates metadata that allows tiering to be applied to data beyond the document management system.

He also points out that Symantec can tier data at the moment of its creation, saving the need for later classification and movement.

Co-existence is possible

But hardware vendors are also happy to co-exist with other tiering players.

Adrian De Luca, Hitachi Data Systems’ Director, Pre-Sales & Solutions for Australia and New Zealand, says “there can be synergies doing it in both layers of the stack.”

“The advantage of implementing tiering at the software or application level means the policies for moving the data between tiers can be very sophisticated since it is based on rich metadata associated with the content. This is great if you want to implement an archiving policy for retention purposes. However, the downside is that you will need to create policies for each application which can become complex and end up being a management nightmare.”

“Storage vendors on the other hand can implement tiering at the volume, block or file level which can solve a number of other problems. For example, if a critical application requires more performance, promoting the data ‘volumes’ from tier 3 to tier 1 non-disruptively can provide immediate relief. Some storage technologies can move individual ‘blocks’ from a volume; this is particularly useful if you have any ‘hot’ areas being accessed frequently to improve transaction times. File servers are a perfect candidate for tiering due to the rich nature and value of the files they hold; being able to move files based on age, accessibility or size between different tiers ensures data is stored on the right cost of storage. Some file server technologies offer this built into the appliance. All these scenarios can be managed from a common software suite which significantly reduces management.”

De Luca therefore concludes that “software and hardware tiering can be quite complementary.”
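The file-level tiering De Luca describes, in which files move between tiers according to age or size, can be pictured as a simple policy function. The sketch below is illustrative only: the `FileInfo` class, the `choose_tier` function and every threshold in it are hypothetical and do not reflect any vendor's product.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    days_since_access: int  # age, measured from the last access
    size_gb: float


def choose_tier(f, hot_days=30, warm_days=180, large_gb=50.0):
    """Return a tier from 1 (fastest, dearest) to 3 (slowest, cheapest).

    All thresholds are made-up defaults chosen for illustration.
    """
    # Recently used, modestly sized files earn a place on the fastest tier.
    if f.days_since_access <= hot_days and f.size_gb < large_gb:
        return 1
    # Files still in occasional use, or recent but very large, sit in the middle.
    if f.days_since_access <= warm_days:
        return 2
    # Everything else goes to the cheapest storage.
    return 3


for f in [
    FileInfo("claim-form.pdf", 5, 0.01),
    FileInfo("site-survey.mov", 5, 120.0),
    FileInfo("archive-2008.zip", 400, 1.0),
]:
    print(f.name, "-> tier", choose_tier(f))
```

An application-level policy of the kind document management vendors propose would add business metadata, such as whether a claim is still open, to the same decision.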

“I don’t believe that array vs. document management/archive is an ‘either/or’ proposition,” says NetApp Consulting Systems Engineer John Martin. “Much like the data replication market, where applications such as Exchange and Oracle can provide compelling availability by integrating data replication within their products, these are often used in conjunction with array based replication to protect the rest of the data that does not reside in those databases. As a result, it is likely that many organizations will use a combination of array based data placement optimization along with document management and traditional archiving approaches where these provide additional compelling functionality beyond simply reducing storage costs.”

But EMC’s Chief Technology Officer, Marketing, Clive Gold thinks co-existence is not a good idea.

Asked what users should do if they find themselves with overlapping and competing tiering systems from hardware and software vendors, he recommends choosing one tiering platform.

“Organisations need to disable one of them,” he says. “The trend is to more intelligent infrastructure that automates this level of management. This results in lower administration, more performance and better utilisation. This is the dream of the Virtual Private Cloud - low-cost, highly efficient and flexible computing.”

So which to turn off? IBRS Analyst Dr Kevin McIsaac says the choice may not be very important.

“The storage guys will say ‘pick me’ because:

  1. This is what they know
  2. They will say we can do this for any app.”

“The content management guys will say ‘pick me’ because:

  1. This is what they know
  2. They will say it is tightly integrated with the content management system and we can drive more benefits, such as tiering any storage from multiple vendors.”

“So unless you can clearly identify that one platform is strategically more important to the organisation, it really does not matter.”
