Ten considerations for email archiving - Part 1

Email archiving products vary in their features and technical structures. Here's how to select an archiving tool that's a good fit for your company's needs.

More and more companies are archiving their users' emails for business and legal reasons. If you haven't standardised on an archiving product, it can be a time-consuming process to find one that fits your company's needs; there are many choices available and each tool has unique features.

When examining an email archiving product, it's important to know how well it's suited to the specific requirements of the email system it's intended to protect. I've reviewed many of these products and compared their functionality to the requirements of dozens of companies. The following 10 questions will help you narrow down the available email archiving products to those that best serve your needs.

Not all of the following 10 questions will be important to every storage environment, but each one should be considered when making a product selection. You should decide whether or not a particular function is important in your environment. Not all email archiving implementations require legal-hold capability, for example. There can also be a spectrum of answers to each question, and not every environment needs the most extreme, feature-rich solution.

There are many considerations beyond the technical issues outlined here. One of the primary deciding factors in any technology purchase is cost, which itself includes many variables. Vendor reputation, customer service and geographic support coverage may all influence product selection. While these factors aren't taken into account in this article, any one of them may have an impact and must be carefully considered.

  1. How complete is the archive?
    Not all email archiving solutions capture every email, and complete capture isn't always required. In some environments, only messages sent to or received from the outside world need to be retained, so an email archive that uses a gateway approach would be acceptable. But many organisations require a more complete set of email messages, so the archive must interact with the mail server to ensure that all messages, both internal and external, are retained.

    Even if an email archiving application captures inside and outside messages, some messages may still fall through the cracks. Archives that "sweep" through the mail system on a scheduled basis can miss messages that are sent, received and deleted between sweeps. Because every message has both a sender and a recipient, both parties would have to delete the message (and potentially empty their trash folders) to hide it in this way; this is often called a "double delete" scenario. Organisations that are focused on compliance must ensure that their email archive captures every message.
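
    The double-delete gap can be illustrated with a small simulation. This is a minimal sketch, not a model of any particular product: the `Message` class, the minute-based timeline and the 60-minute sweep interval are all assumptions made for illustration. It compares what a scheduled sweep sees with what an archive that copies every message on arrival would hold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    msg_id: str
    arrived_at: int                   # minute the message reached the mail store
    purged_at: Optional[int] = None   # minute the last mailbox copy was removed


def sweep_capture(messages, sweep_interval, horizon):
    """Return the IDs a scheduled sweep would archive.

    A sweep at minute t sees only messages that have arrived and have not
    yet been purged from every mailbox (hypothetical, simplified model).
    """
    captured = set()
    for t in range(sweep_interval, horizon + 1, sweep_interval):
        for m in messages:
            present = m.arrived_at <= t and (m.purged_at is None or m.purged_at > t)
            if present:
                captured.add(m.msg_id)
    return captured


def capture_on_arrival(messages):
    """Return the IDs archived when every message is copied as it arrives."""
    return {m.msg_id for m in messages}


messages = [
    Message("kept", arrived_at=5),
    # Sent at minute 65, removed by both parties at minute 90,
    # which falls between the sweeps at minutes 60 and 120.
    Message("double-deleted", arrived_at=65, purged_at=90),
]

swept = sweep_capture(messages, sweep_interval=60, horizon=240)
missed = capture_on_arrival(messages) - swept
print(sorted(missed))  # prints ['double-deleted']
```

    Shortening the sweep interval narrows the window but never closes it, which is why compliance-focused organisations should ask vendors how messages are captured between sweeps.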

  2. Does it record what people do?
    One step beyond a complete set of messages is an archive that maintains a record of user actions. Some systems are capable of recording whether a user opened, forwarded, flagged or filed an email message, a feature that has proven popular in product demonstrations.

    However, "just because a message is marked as 'read' doesn't mean that a user really read it," says Matthew Ushijima, director of IT network operations at Empire Today. "Outlook's preview pane can interfere in both positive and negative ways, making this [product feature] not the most reliable data source," he adds.

    Capturing the actions users take regarding their email messages is a difficult technical problem. Traditional archiving products, which commonly use Exchange journaling, must sweep through the mail system using MAPI to periodically examine each message to capture this so-called user-action metadata. MAPI sweeps consume valuable CPU and I/O resources, so additional mail servers must be added to handle the load. An alternative approach to archiving, called log shipping, doesn't require these intensive sweeps, but is much less common. Consider whether this kind of user-action information is critical to your archiving needs.

  3. Can the archive ingest an existing mail store or PST files?
    Many organisations would like their email archive to include messages that existed before the archiving application was installed. These messages typically come from the mail system itself, which might include a decade or more of old mail, as well as from offline or user-created archives, like the PST files created by Microsoft's Outlook mail client. Many archiving programs are able to pull in these old messages, but some can't (see "PST indigestion," below).

    Bringing in old messages from a mail server generally requires an intensive migration process using the MAPI protocol. This can take a few days, so the process is often performed over a weekend; large environments and those with email servers in multiple locations may find that it takes much longer.

    Most email clients store personal archives on local disks, so these may be anywhere your users are, including laptops, desktops, network shares and portable drives. This makes importing archives tricky, as they must first be located and consolidated. Not every system can handle all formats, which range from Outlook PST and Notes NSF to Unix mbox and maildir.

    No matter where historic messages are imported from, the archive that contains them should be flagged as incomplete and potentially unreliable if ediscovery is a consideration. Both email servers and personal archives are almost certainly missing a great many messages. It's a trivial operation to change the content of most personal archives; modern email archive systems are far more tamper-proof.
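
    Locating personal archives before consolidation can be partly scripted. The following is a rough sketch under stated assumptions: the function names are hypothetical, matching on file extension is only a heuristic (mbox files in particular often carry no extension), and a maildir is recognised here simply by the presence of its `cur`, `new` and `tmp` subdirectories.

```python
import os
from pathlib import Path

# Heuristic mapping of file suffix to archive format (an assumption;
# real-world archives may use other names or no extension at all).
ARCHIVE_SUFFIXES = {
    ".pst": "Outlook PST",
    ".nsf": "Notes NSF",
    ".mbox": "Unix mbox",
}


def is_maildir(path: Path) -> bool:
    """A directory is treated as a maildir if it has cur/, new/ and tmp/."""
    return all((path / sub).is_dir() for sub in ("cur", "new", "tmp"))


def find_personal_archives(root):
    """Walk root and return a sorted list of (path, format) candidates."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        here = Path(dirpath)
        if is_maildir(here):
            found.append((here, "maildir"))
            dirnames[:] = []  # do not descend into the maildir itself
            continue
        for name in filenames:
            kind = ARCHIVE_SUFFIXES.get(Path(name).suffix.lower())
            if kind:
                found.append((here / name, kind))
    return sorted(found)
```

    Calling `find_personal_archives` on a network share or a mounted user profile yields a list of candidates to review; it finds files, but it can't tell you whether their contents are complete or unaltered.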

  4. Can the archive handle multiple email systems?
    Not every email archiving application is capable of handling multiple email servers. If your environment features more than one email server, and especially if a variety of email systems are in use, this feature could prove critical. Generally speaking, archives that use a messaging gateway are far more flexible in heterogeneous environments than those that integrate more directly with the mail system.

    Mixed environments are especially common in organisations created through corporate mergers, but some organisations find themselves with heterogeneous mail systems for historical reasons. Whatever the cause, many email archive solutions don't support all of the various email servers, including Microsoft Exchange, IBM Lotus Notes/Domino, Unix mail and Apple's mail server.

  5. What about non-message content?
    Some email archiving applications focus only on messages, while others can also archive calendar items, tasks and contacts. A few also support other applications, including file systems, instant messages and database applications. Not every environment needs this type of archiving, but be sure to set expectations with management and your legal department about what is and isn't saved. While some archiving systems support content outside the email system, "email is the most critical," maintains Kelly Ferguson, senior product marketing manager for email archiving at EMC Corp. "Including file systems and SharePoint is nice, but email must get under control because it has the biggest risk due to message proliferation. Customers are starting with email, but have the expectation that the system can expand to other content types as need arises."


PST Indigestion

Eliminating "underground archives" like Microsoft Outlook PST files is a primary goal of many email archiving projects, but one that often proves difficult to attain. It's a simple matter to turn off PST archive support in Outlook, but this must be put off until existing archives are located and ingested. Remind users that the new archive will actually make their mail more available to them; with the company archive they may now be able to access their old messages from Outlook Web Access (OWA) and BlackBerry devices.

But beware when importing old archives that have been out of your control. At the very least, they're incomplete, as users almost certainly selectively saved email, deleting some, keeping others in their inbox and archiving a few. It's also possible for a malicious user to have changed the content of one of these personal offline archives, creating new messages, or deleting or modifying old ones. Therefore, you must consider how reliable this source is from a legal or compliance perspective.

If you're applying a deletion policy to email, consider suspending it, at least temporarily, when it comes to PST imports. If you import old archived mail and then immediately delete it, you'll lose credibility in the eyes of the very users you're trying to help, and possibly raise compliance and legal issues. Give your users enough time to categorise and thus preserve their imported messages, and then educate them about the importance of retention and destruction.
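
One way to picture a temporary suspension is as a grace period in the deletion rule. The sketch below is purely illustrative: the three-year retention period, the 90-day grace window and the `ArchivedMessage` fields are assumptions, not settings from any real product, and in practice a categorised message would follow its category's own retention schedule rather than being kept indefinitely.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

RETENTION = timedelta(days=365 * 3)   # hypothetical: delete mail older than about 3 years
IMPORT_GRACE = timedelta(days=90)     # hypothetical: hold imported mail for 90 days


@dataclass
class ArchivedMessage:
    sent_on: date
    imported_on: Optional[date] = None  # None for mail captured live
    categorised: bool = False           # user filed it for preservation


def eligible_for_deletion(msg: ArchivedMessage, today: date) -> bool:
    """Apply the deletion policy, sparing recent imports and categorised mail."""
    if msg.categorised:
        return False
    if msg.imported_on is not None and today < msg.imported_on + IMPORT_GRACE:
        return False  # still inside the grace period after a PST import
    return today - msg.sent_on > RETENTION
```

With these assumed values, a 2019 message imported from a PST file on 1 May 2024 would not be deleted on 1 June 2024, while the same message captured live would be.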
