Systems needed to resolve e-mail crunch

With the regulations for data storage becoming increasingly demanding, finding workable solutions to e-mail archiving requires a sophisticated approach

Nobody has summed up the phenomenon with a Moore’s Law-type formula yet, but it seems that as the global volume of data increases, the rules obliging you to keep it all for many years multiply exponentially.

Nowhere is that more evident than with e-mail. Market research firm IDC estimates that the volume of business e-mail sent worldwide in 2007 will approach five exabytes – about five billion gigabytes – a volume that has doubled in each of the past two years. At the same time, the Storage Networking Industry Association (SNIA) estimates that there are about 10,000 regulations worldwide that compel organisations to retain data.

The move to paperless working was supposed to herald a new age of simplicity and efficiency, but it has actually made the keeping of records more difficult for business, says Hamish Macarthur, chief executive at analyst firm Macarthur Stroud International.

“To archive all business transactions so they can be retrieved and used was straightforward in paper-based working, but as we virtualise everything it has become more difficult to keep the information we need for legal and regulatory reasons,” says Macarthur.

There are a number of UK regulations that enforce the retention of e-mail. The Freedom of Information Act dictates that public sector bodies must be able to supply copies of “recorded information” on request, even if generated before the act came into force in 2005. Through the Data Protection Act, a member of the public can request information held about them. The US Sarbanes-Oxley regulations apply to e-mails containing company financial information for Nasdaq and NYSE-listed businesses and their UK subsidiaries. And there are also industry-specific controls, such as those enforced by the Financial Services Authority, which require all e-mails be held for six years.

In addition, there are the increasingly onerous evidential requirements of court proceedings, which require a verifiable e-mail audit trail from “reliable” systems to be admissible. Providing this quickly can save a lot of money and mean the difference between legal success and failure. Financial services firm Morgan Stanley found this out last year when US courts fined it £7m for not keeping complete e-mail archives.

Organisations are caught in a web of growing e-mail volumes and increasing regulatory requirements. “In the UK, the primary drivers are regulatory compliance, storage resource management, e-mail serviceability and recovery,” says Glenn Weavind, senior consultant at CA. “Litigation is also proving expensive to businesses – one firm in the City recently spent £250,000 on contractor time recovering e-mails to support a case.”

So, what are the options? Well, you can, of course, use the information store of your e-mail system to maintain an archive, but this is a risky strategy, as leaving tens of thousands of old messages sitting in the database will soon degrade its performance.

Niels Abildgaard, sales specialist with IBM System Storage Solutions, says, “Maintaining e-mails within the core e-mail production system will pose significant challenges from an IT operational point of view, which is reason enough to drive the need for e-mail archiving by reducing the size of the production e-mail systems.” 

The next option is to off-load old e-mails to storage media. This keeps the e-mail server performing well, but it means specific e-mails can be found only through relatively laborious processes that fall short of the highest standards of compliance. It also wastes storage space by, for example, duplicating numerous identical attachments.

So, businesses are increasingly opting for specialised e-mail archiving technologies, which come as software, hardware, services and combinations of all three. At their most basic, the key task of archiving technologies is to automate the process of storing e-mails from the live server after a given period. This ensures that the production e-mail system is kept trim to optimise performance, and that e-mails are retrievable in future.
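The basic task described above, moving messages off the live server once they reach a given age, can be sketched in a few lines of Python. This is an illustration only, not any supplier's implementation: the `Message` class, the `select_for_archive` function and the 30-day threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    # Hypothetical minimal record of an e-mail on the live server.
    message_id: str
    received: datetime
    subject: str


def select_for_archive(live_store, now, max_age_days=30):
    """Split the live store into (to_archive, to_keep) by message age."""
    cutoff = now - timedelta(days=max_age_days)
    to_archive = [m for m in live_store if m.received < cutoff]
    to_keep = [m for m in live_store if m.received >= cutoff]
    return to_archive, to_keep


now = datetime(2007, 6, 1)
live_store = [
    Message("a1", datetime(2007, 5, 25), "Quarterly figures"),
    Message("b2", datetime(2007, 3, 10), "Contract extension"),
]
to_archive, to_keep = select_for_archive(live_store, now)
print([m.message_id for m in to_archive])  # ['b2']
```

A real product would run a rule like this on a schedule and write the selected messages to a separate repository before removing them from the production system.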

As you go beyond basic levels of functionality, archiving systems begin to include features such as the removal of duplicate attachments, search functions (including the ability to search for attachments in some cases), tamper-proofing, audit trails and reporting.

Dedicated e-mail archiving systems generally fall into three types. First, the software-plus-storage approach, in which organisations develop or buy archiving software and any necessary hardware, such as servers and storage devices.

In this type of system, policy-based archiving software schedules the copying of e-mail from the information store to a repository, and software on client PCs enables search and retrieval. It is possible to knit together components from different suppliers or to buy all of them from a single supplier, such as HP. HP uses its own software to schedule and provision archiving using its grid-like Smartcells system (see Coda case study).

A more self-contained approach is the e-mail archiving appliance. Instead of buying software, servers and storage from one or several suppliers, all the hardware and software required to begin archiving is supplied in one box. Suppliers of e-mail archiving appliances include Forensic and Compliance Systems (FCS) (see Guildford Council case study).

FCS’s history is in compliance tools, which it has merged with a hardware system. FCS claims “forensic” standards of e-mail retention that provide an unbroken record of e-mail and instant messaging communications at arm’s length from existing operational systems.

Finally, there is the outsourced approach, in which a third-party software interface connected to the messaging system captures and transmits messages to a service provider’s datacentre for storage.

This method is often hybridised, as is the case with Messagelabs’ service. Software and hardware are installed at the client’s site and combined with secure storage managed by a third-party provider with full encryption, the keys to which are held by the customer.

Dennis Szubert, principal analyst with analyst firm Quocirca, says, “The advantage of this hybrid model is that because the archiving equipment is local, mailbox indexing, retrieval and similar functions are faster. The off-site provider holds the data, but not the key required to read it.”

Case study: Coda

Financial accounting software provider Coda has 550 employees and operates in 14 countries with 2,500 medium and large user organisations as customers. Annual e-mail traffic is estimated by group IT manager Richard Hall to be about eight million to nine million e-mails, to which spam “adds a zero”, he says.

“We started off by trying to deal with personal folders in Outlook. We used inbox limits and reminders to users, but we could not apply a corporate policy this way and searching at this level was impossible,” says Hall.

“We were dealing with back-up issues and ever-increasing storage demands, and we were spending hours on manual back-up and information retrieval,” says Hall.

“With millions of e-mails passing through the company every year, it was time and cost intensive to archive and search them, and we were aware that we could be leaving the business open to risk. Also, for e-mails and other documents to be admissible in court, you must be able to prove that items have not been tampered with.”

Coda opted to procure an HP-based e-mail archiving system in 2005, and expects to gain complete return on investment this year. The system is based on HP’s Reference Information Storage System (Riss). Riss is an HP technology that ties together software and hardware in a grid-like configuration of so-called Smartcells.

Data is stored securely, with date and time stamping of objects to mitigate risk and prevent tampering or changing of retained records. As e-mails arrive at Coda they are simultaneously stored in Coda’s Riss architecture, which is currently configured at 2Tbytes.
Attachments are archived out of Outlook after seven days, and after 30 days everything is auto-archived and single instanced, so that multiple copies of identical data are not kept.
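
Single instancing of the kind described here is commonly done by keying each attachment on a hash of its contents, so that identical files are stored once and merely referenced by every message that carries them. The Python sketch below shows the idea under that assumption. It is not a description of how Riss works internally, and the `SingleInstanceStore` class and its method names are invented for the example.

```python
import hashlib


class SingleInstanceStore:
    """Keeps one copy of each distinct attachment, keyed by its SHA-256 digest."""

    def __init__(self):
        self.blobs = {}       # digest -> attachment bytes, stored once
        self.references = {}  # message id -> list of digests it refers to

    def add(self, message_id, attachment):
        digest = hashlib.sha256(attachment).hexdigest()
        # Only the first copy of identical content is actually stored.
        self.blobs.setdefault(digest, attachment)
        self.references.setdefault(message_id, []).append(digest)
        return digest

    def attachments_for(self, message_id):
        return [self.blobs[d] for d in self.references.get(message_id, [])]


store = SingleInstanceStore()
report = b"annual report contents"
store.add("msg-1", report)
store.add("msg-2", report)               # same file sent again
store.add("msg-3", b"a different file")
print(len(store.blobs))  # 2 stored copies for 3 attachments
```
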
“We used to have situations where people were spending hours every day searching for things. We had a case recently where we had to retrieve an e-mail from four years ago and were able to find it in seconds rather than go through old tapes,” says Hall.

Other benefits include cutting down on Exchange maintenance, as well as significantly reducing time spent on disaster recovery testing.

Case study: Guildford Council

Guildford Borough Council has 1,100 e-mail users, and 90% of their communication with the public involves some form of electronic response. This volume alone is enough to generate a considerable e-mail archiving challenge, but, like all local authorities, Guildford has also had to contend with a barrage of legislation enforcing legal compliance.

The introduction of the Human Rights Act, Data Protection Act, Regulation of Investigatory Powers Act and the Freedom of Information Act placed new restrictions on the council’s right to monitor staff e-mail and strengthened the individual’s right to privacy.

Audit section manager Joan Poole soon discovered that Guildford’s e-mail system and privacy policy were out of date and not compliant with the legislation. “I looked at the existing software, which was working okay, but was totally non-transparent,” says Poole.

“I could go in and look at anybody’s e-mail. There was no trail. If you wanted to, it could be abused, and with privacy in the workplace it was obviously in breach.”

In 2003 the council implemented FCS’s Cryoserver, and by 2005 was fully compliant. FCS likens Cryoserver to a black-box flight recorder for e-mail. It records every e-mail generated by or sent to a Guildford council account and has a Google-like search facility. It is also tamper-evident: an administrator cannot use Cryoserver to look at employee e-mail information without creating an audit trail that is automatically sent to three “data guardians”.
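
One general way to make an audit trail tamper-evident is to chain its entries together with cryptographic hashes, so that altering any earlier entry invalidates everything recorded after it. The Python sketch below illustrates that technique only. It makes no claim about how Cryoserver is built, and the `append_entry` and `verify` functions and the record fields are assumptions made for the example.

```python
import hashlib
import json


def append_entry(log, actor, action):
    """Add an entry whose hash covers its content and the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    record = {"actor": actor, "action": action, "prev": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(record)


def verify(log):
    """Return True only if no entry has been altered, removed or reordered."""
    prev_hash = "0" * 64
    for record in log:
        body = {k: record[k] for k in ("actor", "action", "prev")}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if record["prev"] != prev_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


log = []
append_entry(log, "admin", "searched mailbox of user 17")
append_entry(log, "admin", "opened message 42")
print(verify(log))  # True

log[0]["action"] = "nothing to see"  # simulate an attempt to rewrite history
print(verify(log))  # False
```

Sending copies of each entry to independent parties, as Guildford does with its data guardians, adds a further check that does not depend on the system that wrote the log.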

Besides compliance, a key benefit for the council has been the ability to resolve potential disputes. Because Cryoserver keeps a record of all e-mails, it is able to provide evidence that e-mails do not exist, as well as evidence for those that do.

“We had a situation with one of our contractors where there was a dispute over damages claimed for an over-running contract,” says Poole. “The contractor was alleging that an extension of time had been agreed in an e-mail. I tracked Cryoserver and there was nothing there. It is very useful to be able to prove a negative.”

Guildford’s success in implementing e-mail archiving systems has seen it become one of only 28 councils out of 273 in England that have been awarded “excellent” status by the Audit Commission.
