Archiving is becoming one of the biggest headaches in storage
management today, mainly because there is no single, cohesive
approach to archiving everything, according to customers at a
recent EMC user group meeting.
About 20 EMC Documentum users gathered for the first-quarter
2007 meeting of the Northeast EMC CMA User Group, and
the issue that cropped up the most was companies having to knit
together different products in order to have a complete archive for
records management that would also satisfy legal discovery
requests.
"What should go into an [EMC] EmailXtender archive and what goes
into the [EMC] Documentum email archive … how do we bridge the two,
decide what goes into which repository and then be able to search
through both repositories," said a messaging and collaboration
architect at a global pharmaceutical company.
He added that if there have to be separate archives, then the
"architecture has to be fluid" to allow data to flow between them.
"EMC is very good about this concept of storage at the physical
layer, but there needs to be a tiered structure between the
software layers as well." This same model must extend to all
documents, not just email, he said.
His sentiments were echoed by another user in the insurance
industry, who said that his company is pulling in 1.5 million
emails a day and is struggling to figure out what to keep and where
to store it.
He calculated that with the 40,000 PST files his company
reluctantly stores, it would take 4,000 hours of productivity a day
to correctly classify that data. "I don't think I can get approval
for that," he joked. Furthermore, two people who receive the same
email may classify it differently, leaving open which location and
retention period are correct. Ultimately, he said he's decided that automated
classification is the only way to go, even though there will be a
margin of error in this approach, too. "We need federated policy
management that knits together policies across all systems," he
said.
Another user volunteered that his company had identified 300
different document classifications and that none of the
classification tools came close to meeting this requirement
today.
A spokesperson for EMC asked the audience whether they thought
keyword classification on every single email that comes into an
organization was a good idea, versus full-text indexing. The
consensus among users seemed to be to start with high-level
classifications and then get deeper into it as the software and
users get more sophisticated. "Don't forget your classification
system is only as good as the queries your people write," one user
said.
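To make the "start with high-level classifications" idea concrete, here
is a minimal sketch of keyword-based classification in Python. It is not
how EmailXtender or Documentum work; the category names, keyword lists
and the classify_email function are all hypothetical, chosen only to
illustrate the approach users described.

```python
import re

# Hypothetical high-level categories and keywords (assumptions for
# illustration; not taken from any EMC product or from the meeting).
CATEGORY_KEYWORDS = {
    "legal": {"contract", "subpoena", "litigation"},
    "finance": {"invoice", "payment", "budget"},
    "hr": {"payroll", "benefits", "resume"},
}


def classify_email(subject, body, default="unclassified"):
    """Return the category whose keywords appear most often in the message."""
    # Lowercase the text and split it into a set of distinct words.
    words = set(re.findall(r"[a-z]+", f"{subject} {body}".lower()))
    # Score each category by how many of its keywords are present.
    scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default when no keyword matched at all.
    return best if scores[best] > 0 else default
```

A message with the subject "Invoice overdue" and the body "Please send
payment" would land in the "finance" category, while one with no matching
keywords stays "unclassified". Finer-grained classes could be added to the
keyword table later, which mirrors the users' advice to go deeper only as
the software and the people using it mature.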
Mergers and acquisitions are another factor driving users to
figure out how to standardize on a single records management
system. "We need to be able to bulk load 20 million documents a day
into Documentum … there's no good options when company A using a
common store and company B using Documentum comes together,"
another user said. "The repositories are completely different."
An EMC official said that Documentum 6.0 (D6), expected later
this year, will address many of these issues. In the meantime, he
advised users to consider EmailXtender for email that has a shorter
life span and Documentum for longer-term records management. But
users made it clear that this is easier said than done.