When disaster strikes

Will you know what to do? Karl Cushing examines the best approaches to business continuity

Many companies still aren't doing enough to ensure business continuity and run the risk of not being suitably prepared the next time a catastrophe occurs, according to a recent Gartner/Society of Information Management survey. Similarly, IDC claims that the majority of European enterprises with global revenues of more than $100m (£66m) still have no formal business continuity plans in place.

Disaster recovery and business continuity have become big news over the past six months following the 11 September terrorist attacks in the US. The events of that day shocked businesses into realising the importance of having an adequate disaster recovery plan and sufficient back-up. However, it appears that the majority of the talk has failed to be translated into action.

The importance of ensuring business continuity cannot be overemphasised. Uptime is everything these days and companies have a customer service requirement to deliver uninterrupted services. Shutting down systems for reboots or essential maintenance is becoming less and less acceptable and there is increasing pressure on companies to complete systems maintenance without incurring downtime. On top of this, organisations are having to manage ever increasing amounts of data.

"Modern businesses of any size cannot afford to ignore the possibilities of major loss through natural, technical or artificial crisis that causes them to suffer interruption to mission-critical functions and processes, or damage to their reputation or brand name," says David Roberts, chief executive of corporate IT user group The Infrastructure Forum. "Such incidents are often unavoidable but through careful business continuity planning their impact can be minimised."

However, in terms of developing a business continuity strategy, companies may well be making a mistake by focusing on the events of 11 September. They constitute a wholly unpredictable event and in terms of business continuity and ensuring availability of services there are more immediate and less dramatic reasons why companies should wake up to the importance of such issues.

According to David Sedgwick, UK and Ireland storage business manager for Compaq's enterprise business group, focusing on disaster recovery is a big mistake. "Disaster recovery is the lowest level in the equation," he says. "By definition it is reactive."

He points out that it is often the mundane things like poor cabling or power supply that cause the biggest problems. "They can have just as much of an impact on a company's ability to deliver uninterrupted services as dramatic events like 11 September," Sedgwick says.

This point is also illustrated by Robin Gaddum, managing consultant at business continuity firm Guardian iT. Gaddum says a survey carried out by the company a few years ago suggested that, in the City of London, companies were about 40 times more likely to suffer disasters and interruptions to their business as a result of the efforts of their plumbers than from terrorists.

Gaddum believes the impact of 11 September on business thinking in the UK has been minimal, pointing out that organisations in the UK have lived with the threat of terrorist attacks for some time now. However, he admits that the events of that day have led to changes. "One of the things we're seeing is that auditors and insurance companies have a different attitude to business continuity now," he says. "They're starting to insist that these plans are up to date and tested and that's obliging companies to take it more seriously."

This trend is compounded by recent codes of practice and legislation such as the stock exchange's Turnbull report and the Cadbury report. According to Christopher Young, managing director of the Impact Programme, a development network for IT directors, "If you are the director of a business you have a legal liability these days to ensure your business will continue, especially if it is publicly listed."

However, it appears that the warnings are still, for the most part, going unheeded. "Many companies are not doing enough," says Wouter Senf, chief technology officer of business continuity provider Global Continuity. He says one of the biggest problems with business continuity strategies is that too many companies implement solutions and then neglect to test and reappraise them. "There are many companies that don't understand the importance of the IT infrastructure to the organisation," he points out.

Conducting regular rehearsals and disaster recovery mock-ups is vital, Senf says, and companies should do at least one a year, although he says he would do them much more often for peace of mind. The more complex the strategy, the more often rehearsals should be done to make sure everything goes smoothly when the time comes. There will be changes at every rehearsal that require minor tweaks, Senf adds, so do them regularly.

Young also stresses the necessity of rehearsals. "People need to know what to do if all hell breaks loose," he says. "You don't want everyone running around like headless chickens." The company needs to know where people are, including contractors, consultants and people on holiday, and Young suggests using an off-site records office for people to call in the event of a disaster. He also stresses the need to form a small chain of command. "And that chain of command should be leading not managing," he says.

Young says the whole awareness-raising process should be supported by a concise disaster recovery plan - not a 700-page manual that will be left on a shelf. Young gives the example of one company that has a fold-up disaster recovery plan, like a train timetable, that fits into the wallet that contains the employees' identity pass.

Senf and Young stress the importance of procedures. "Probably the best way to improve uptime and robustness is to improve procedures," Senf says. "The most challenging part is having all the procedures in place to get back up again and then switch back."

However, unlike Senf, who emphasises the danger of the business not understanding the importance of the IT side, Young says one of the most common weaknesses of business continuity strategies is that they are written by the IT department and focus on restoring systems, when they should be governed by the business side and focus on restoring processes. For Young, the key to a successful recovery is to identify those processes and to understand what order to restore them in.

The fact is that any successful business continuity strategy needs input from both the business and the IT sides of the company. "It is vital that the planning of a business continuity programme should not be exclusively an IT department project - there must be commitment from the entire organisation," says Roberts.

Sedgwick agrees. "These solutions cannot be effective if they aren't signed up to by all aspects of the business," he says. "For a business continuity strategy to be effective it has to be embraced by the whole organisation and almost become part of the culture."

For Sedgwick, the crux of the matter is that managing, protecting and backing up the massive amounts of data a company generates these days has become a huge headache using existing technology and companies need to rethink operationally. And he stresses that any organisation delivering a service via its computer and data systems to a customer needs to address the issue of business continuity, regardless of size.

For small- and medium-size enterprises on tighter budgets, however, options such as mirroring a storage area network will probably be too expensive. This is where the application service provider (ASP) model comes in. As always, the important thing here is to do your homework - not all ASPs are the same and service level agreements vary enormously. Spending a bit of time hammering out a comprehensive agreement and working with the third party to tailor the offering to your company's specific requirements might well save some serious browbeating and hair pulling further down the line.

There are options for smaller companies other than using ASPs, however. Senf suggests running more than one computer network or using tape back-ups and insists that mirroring data is not necessarily an expensive choice, depending on the technique used. He says one cheap option is to back up data on a server in the basement. "This will help a lot in increasing the availability of data and uptime," he says. Another benefit of such a simple option is that conducting disaster recovery mock-ups on such a system will be very quick and can be done often, he adds.

One important lesson that should be learned from 11 September is the importance of locating back-up systems at a safe distance, as illustrated by the story of a company based in one of the twin towers which had backed up its systems in the other one. But although terrorist attacks are a factor, a more immediate threat comes from natural disasters like freaks of weather, volcanoes and earthquakes. In the UK though, not known for its natural disasters, it's not necessary to locate your back-up in the Highlands or the depths of the Norfolk Broads. Josh Krischer, vice-president and research director at analyst Gartner, recommends that in the UK a distance of 10km to 15km should be sufficient.

Gaddum agrees. Although he says the further the better, he admits that this isn't always practical. "You have to be reasonable about these things," he says. "If you're in London you don't want your back-up silos in Birmingham." Gaddum points out that if a member of staff accidentally deletes a file you don't want to have to travel to a far-flung second site. But the bottom line is that multi-site companies have a better chance of recovering than single-site ones.

Companies should base the level of sophistication of their business continuity strategy on the nature of the business and the potential damage that could be caused by the loss of intra-day data and downtime. Maintaining data availability is naturally going to be more critical to a bank or an airline than a company involved in manufacturing, and whilst relying on tape back-up, resulting in the loss of some data, will be acceptable for some companies, others will need to look at data mirroring.

"If three seconds downtime signals the death-knell for the business then you have to look at data mirroring," says Young.

Senf agrees, but also points to the importance of other criteria like buildings and their locations. Companies should look at how secure the buildings are and include the communications and power infrastructure, looking at locations serviced by more than one provider, he says.

In short, the basic message is that UK companies can no longer afford to bury their heads in the sand or trust to luck. The issue of business continuity is not going to go away and it will become increasingly important over the next few years. Regardless of what approach they take, companies in the UK need to take the issue of business continuity more seriously. Hopefully more of them will be looking forward to backing up soon.

Three phases of recovery
Stage 1
Robin Gaddum, managing consultant at business continuity firm Guardian iT, says the first minutes and hours will see the company go through three key phases:
  • Emergency response - focusing on people and making sure they are safe and secure
  • Containment - limiting damage
  • Damage assessment - seeing what is still working and available and seeing what the damage is

"What's important in the early days is good, clear command and control," he says.

Stage 2
The business continuity phase, which will roughly form the first five days, will involve:
  • Restoring critical business processes
  • Identifying what work and/or people have been lost
  • Re-establishing links with external suppliers and customers

At the end of this period the company will still be in an "artificial state", says Gaddum. It might be using equipment borrowed from a third party, housed in temporary accommodation and/or using fewer people.

Stage 3
The third and final stage involves the actual recovery itself, including dealing with insurance claims, getting in structural engineers and relocating to a permanent place of work.

Six steps to business continuity
David Sedgwick, UK and Ireland storage business manager for the enterprise business group at Compaq, offers the following six tips:
  • Form a dedicated business continuity management team, including representatives from the core areas of the company such as human resources, finance and IT
  • Develop a business continuity plan and a strategy. Determine what service level you need to deliver by determining what the customers want and what they'll be happy with
  • Remember that business continuity isn't just about protecting your data but also the physical systems that house it
  • The next step is getting board-level buy-in. Sedgwick says this is important as effective business continuity measures like replicating data aren't cheap. He says IT directors can justify the expense, however, by presenting the board with a few what-if scenarios looking at the costs of downtime, loss of face, lost value from aborted transactions and damage to brand
  • Make sure staff are prepared. This is a part of best practice that's often left out, he says, and it is important to make sure skills and staff are replicated, as well as the data, and to invest in training
  • Rehearse. Sedgwick recommends that companies rehearse at least once a year, although he says the timing should be dictated by organisational change. According to Sedgwick, one of the biggest factors affecting business continuity is change, and managing change control is crucial. He recommends companies do an audit when they institute major changes, including cultural and structural changes, to see how they will affect the business continuity strategy.
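
As a rough illustration of the what-if scenarios Sedgwick suggests presenting to the board, the short Python sketch below adds up the cost of an outage from lost revenue during downtime, the value of aborted transactions and a one-off estimate for damage to brand. The class name, the field names and every figure are hypothetical, invented purely for illustration; none of them come from Sedgwick or from any company mentioned here, and a real assessment would use the organisation's own numbers.

```python
from dataclasses import dataclass


@dataclass
class OutageScenario:
    """One what-if scenario. All fields are illustrative assumptions."""
    name: str
    hours_down: float             # length of the interruption
    revenue_per_hour: float       # revenue lost for each hour of downtime
    aborted_transactions: int     # transactions abandoned during the outage
    value_per_transaction: float  # average value of one aborted transaction
    brand_damage: float           # one-off estimate for loss of face and brand damage

    def total_cost(self) -> float:
        """Sum the three cost components for this scenario."""
        lost_revenue = self.hours_down * self.revenue_per_hour
        lost_transactions = self.aborted_transactions * self.value_per_transaction
        return lost_revenue + lost_transactions + self.brand_damage


# Made-up figures in pounds, for illustration only.
scenarios = [
    OutageScenario("Power failure, 4 hours", 4, 10_000, 200, 50, 5_000),
    OutageScenario("Loss of site, 3 days", 72, 10_000, 5_000, 50, 250_000),
]

for scenario in scenarios:
    print(f"{scenario.name}: £{scenario.total_cost():,.0f}")
```

Set against the annual cost of a measure such as data replication, totals like these give the board a like-for-like figure to weigh, however approximate the inputs.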
