EMC users highlight storage challenges

An EMC Documentum user group discusses issues ranging from how to build a single records management system to how to manage PST files and satisfy legal discovery requests.

Archiving is becoming one of the biggest headaches in storage management today, mainly because there is no single, cohesive approach to archiving everything, according to customers at a recent EMC user group meeting.

About 20 EMC Documentum users gathered for the first-quarter 2007 meeting of the Northeast EMC CMA User Group, and the issue that cropped up most was companies having to knit together different products to build a complete archive for records management that would also satisfy legal discovery requests.

"What should go into an [EMC] EmailXtender archive and what goes into the [EMC] Documentum email archive … how do we bridge the two, decide what goes into which repository and then be able to search through both repositories," said a messaging and collaboration architect at a global pharmaceutical company.

He added that if there have to be separate archives, then the "architecture has to be fluid" to allow data to flow between them. "EMC is very good about this concept of storage at the physical layer, but there needs to be a tiered structure between the software layers as well." This same model must extend to all documents, not just email, he said.

His sentiments were echoed by another user in the insurance industry, who said that his company is pulling in 1.5 million emails a day and is struggling to figure out what to keep and where to store it.

He calculated that with the 40,000 PST files his company reluctantly stores, it would take 4,000 hours of productivity a day to correctly classify that data. "I don't think I can get approval for that," he joked. Furthermore, two people getting the same email may classify it differently, so which is the correct place and retention for it? Ultimately, he said he's decided that automated classification is the only way to go, even though there will be a margin of error in this approach, too. "We need federated policy management that knits together policies across all systems," he said.

Another user volunteered that his company had identified 300 different document classifications and that none of the classification tools came close to meeting this requirement today.

A spokesperson for EMC asked the audience whether they thought keyword classification on every single email that comes into an organization was a good idea, versus full-text indexing. The consensus among users seemed to be to start with high-level classifications and then get deeper into it as the software and users get more sophisticated. "Don't forget your classification system is only as good as the queries your people write," one user said.

Mergers and acquisitions are another factor driving users to figure out how to standardize on a single records management system. "We need to be able to bulk load 20 million documents a day into Documentum … there's no good options when company A using a common store and company B using Documentum comes together," another user said. "The repositories are completely different."

An EMC official said that Documentum 6.0 (D6), expected later this year, will address many of these issues. In the meantime, he advised users to consider EmailXtender for email that has a shorter life span and Documentum for longer-term records management. But users made it clear that this is easier said than done.
