Disaster recovery needs to shift to focus more on everyday threats. John Kavanagh looks at how to ensure business continuity that will definitely work, at an acceptable price
Terrorist attacks and floods may hit the headlines, but IT managers see everyday things such as software maintenance and network failure as the most likely causes of business interruption.
This finding from research by disaster recovery systems specialist SteelEye Technology – which puts terrorism, natural disasters and denial-of-service attacks bottom of the list of concerns – is backed by many experts who see a need to balance the worst nightmares against the reality of day-to-day IT and business life. Indeed, some say focusing on everyday potential incidents can actually provide better recovery from major disasters.
Business continuity plans and stand-by sites, whether owned or part of a disaster recovery service from a specialist supplier, seem to be the norm these days: 73% of organisations questioned by SteelEye Technology have plans, and 87% of those have stand-by sites.
Almost 65% have automated data replication, and 36% have automated processing switch-over. About 45% have had to activate their plan after an unexpected incident.
But beyond this traditional disaster recovery approach, thinking is changing in line with new threats, new ways of working, and new technology.
On the technology front in particular, fast communication, data compression techniques and storage area networks mean that back-up datacentres can and should be many miles away from the main centre, experts say.
Yet the survey shows that 39% of European companies have their stand-by centre in the same city. Specialists point out that in these cases big incidents could affect both the main and back-up centres; in addition, staff might not be able to get to either centre if transport was hit.
This point was underlined by John Milne, head of business continuity at the Financial Services Authority, at the recent opening of a back-up centre in Romford, Essex, by ICM Computer Group to support City companies.
“Companies need to appreciate the risks inherent in having their disaster recovery centres too close – some in walking distance,” Milne said.
“In a disaster ranging from terrorism to something as straightforward as a gas leak, many locations could be affected. Companies should seek a continuity centre which is geographically separate.”
The business need for continuity preparations is certainly greater than ever, experts say.
“Today’s rigorous market conditions mean that the risk to customer confidence, brand value, market position and the financial implications of being kept from doing business for any period of time are too great to be ignored,” says Ian Bond, a consultant at Cisco Systems UK.
“In addition, there is now pressure from national and international regulators to reduce corporate risk exposure.”
Bond points out that terrorist attacks are rare, but adds that they need to be planned for nonetheless. Other threats, although out of the ordinary, also need to be considered.
“Hurricanes might be few and far between in the UK, but floods are less rare. Snow has been known to knock out power supplies and isolate urban areas; an epidemic could put an area into quarantine.
“Then there are digital threats: malware and even user error, plus denial-of-service attacks, can take servers, branches and even datacentres out of commission for indeterminate periods.”
Even more mundane threats are highlighted by Chris Gabriel, business continuity specialist at systems consultancy Logicalis – and he is one who questions traditional disaster recovery approaches.
“Organisations do need a duplicate datacentre, but many focus other spending on things that are not likely to be needed, and they do not look at everyday things that hit their continuity,” Gabriel says.
“For example, government statistics show that 176 million working days were lost through sickness in 2004, but two-thirds of this was when people were not ill enough to go to the doctor but just felt unwell and not fit to commute – and they would have worked at home if they could.
“There are issues of transport strikes, severe weather and traffic congestion delaying people getting to work. Child sickness can keep parents at home.
“In the South East people increasingly cannot afford housing, and it is becoming increasingly expensive to drive and commute, so getting staff is becoming a business continuity issue there.
“We need a shift in thinking beyond floods and bombs. Very few datacentres have been destroyed by such incidents. Think instead about flexible working and remote working, for example: extend your network into people’s homes – and they might do an extra hour at home in the evening even on a normal day.
“Such thinking about the many minor interruptions makes sense for major disasters too. If your head office is in London and you keep an empty stand-by office in Swindon, do you expect staff to commute there every day during a disaster? If there is a bird flu epidemic the advice is to stop people congregating anyway. So an empty stand-by office is no stand-by against individual sickness, an epidemic, a transport strike or severe weather.
“One utility company told us recently that it spends £1m a year on business continuity – and £600,000 of that pays for an empty office 90 miles away. When we asked why they might ever use it, they could think of nothing apart from the head office blowing up, which they saw as highly unlikely.”
Many organisations are looking at flexible working without thinking about business continuity, and if they took a broader view they could get a better return, Gabriel says. “The human resources department is thinking about flexible working, and IT is spending on duplicate datacentres and empty offices.
“Maintain the spending on a duplicate datacentre, but review the empty office and look at spending on day-to-day operations – which will also support you in a crisis. Companies want business agility and business continuity, and these can be covered by the same investment, giving the best return.
“Companies are also interested in social responsibility, and in particular green issues. Flexible working supports this too, by cutting travel, office space and related emissions.
“IT can take a lead in all three of these business areas – agility, continuity and social responsibility – and get closer to the business at last.”
The emphasis on business in all this underlines the shift in thinking in disaster recovery, from IT stand-by to business continuity. The Gartner research group talks about business impact analysis as a first step in business continuity, and points to its wider benefits.
“Business impact analysis clarifies business expectations by identifying critical business and IT processes,” says Roberta Witty, co-author of a new Gartner report on the topic. “It establishes the cost of downtime and determines recovery time objectives.
“But it also delivers other benefits: it identifies interdependencies between business processes and technology, and helps improve day-to-day business processes and their resilience. It also takes senior management through a decision-making process they may not have been through before, which helps educate them about their business and its risks.”
Senior management needs to be involved for other reasons too, according to Gartner. Just as with IT projects generally, senior management support is key to achieving buy-in throughout the organisation.
“A common mistake is to invest in continuity as a one-off project. Business continuity is evolving. For example, as an organisation’s network of service providers grows it is essential that it evaluates its partners’ business continuity to ensure that any weaknesses in its supply chain are identified and mitigated,” says Floris van den Dool, head of the European security practice at consultancy Accenture.
Best practice guidance in business continuity planning is now emerging, in particular a British Standards Institution standard. The draft standard BS25999 has just been released for public comment.
It consists of a code of practice and a specification for business continuity management. It is intended to replace the widely used PAS56, a Publicly Available Specification developed through the British Standards Institution four years ago.
Some organisations also recommend using the IT Infrastructure Library, which covers continuity management in the service delivery section. In addition, the Business Continuity Institute, formed in 1994, has detailed professional guidance and standards for individual specialists.
Whatever approach companies take to business continuity, the SteelEye Technology survey underlines the need for action. When asked how long the most important services could be lost before the downtime became potentially fatal for the organisation, just under 30% said as little as four hours, another 17% said between four and eight hours and a further 24% gave themselves up to 24 hours.
This means the stark reality of failing to prepare for a disaster – whether a flood or some careless software maintenance – is that about 70% of companies could go out of business in a day or less.
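The arithmetic behind that headline figure is easy to check. The short Python sketch below adds up the three survey bands as a running total; it treats "just under 30%" as 30 and assumes the third band runs from eight to 24 hours, since the survey's exact figures and band edges are not given here. On those assumptions the total comes to a little over 70%.

```python
# Survey bands quoted in the article: (upper limit in hours, share of respondents in %).
# Assumption: "just under 30%" is rounded to 30; the exact figure is not given.
tolerance_bands = [
    (4, 30),   # downtime fatal within four hours
    (8, 17),   # fatal within four to eight hours
    (24, 24),  # fatal within 24 hours (band assumed to start at eight hours)
]


def cumulative_shares(bands):
    """Return (hours, running total %) pairs: share of firms at risk by each limit."""
    running = 0
    totals = []
    for hours, share in bands:
        running += share
        totals.append((hours, running))
    return totals


for hours, running in cumulative_shares(tolerance_bands):
    print(f"fatal within {hours} hours: about {running}% of respondents")
```

Read the other way round, fewer than a third of respondents believed they could survive losing their most important services for more than a day.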
The Buncefield blast: business continuity in action
After the huge Buncefield oil depot explosion near Hemel Hempstead last December, IT services company Steria had its headquarters so badly damaged that access was prohibited by the police for several days.
Yet by the end of the day of the explosion – a Sunday – the company, which has 400 staff at the site, was largely back up and running, and none of its critical business had been interrupted.
Steria’s business continuity plan had identified different priorities for different systems and services. Priorities were broken down by the hour for some, and by three hours, a day or a week or more for others. This meant managers knew what needed to be done in what order.
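A plan of this kind can be pictured as a list of services ordered by how quickly each must be restored. The Python sketch below is illustrative only: the service names and hour targets are invented for the example and are not Steria's, and the four bands simply mirror the hour, three-hour, day and week-or-more tiers described above.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    rto_hours: float  # recovery time objective: longest tolerable outage


# Hypothetical services and targets, invented for illustration.
services = [
    Service("payroll", 168),
    Service("customer service desk", 1),
    Service("staff intranet", 3),
    Service("e-mail", 3),
    Service("management reporting", 24),
]


def tier(rto_hours):
    """Map a recovery target onto the priority bands used in the sidebar."""
    if rto_hours <= 1:
        return "within the hour"
    if rto_hours <= 3:
        return "within three hours"
    if rto_hours <= 24:
        return "within a day"
    return "within a week or more"


def recovery_order(items):
    """Sort services so the tightest recovery deadline comes first."""
    return sorted(items, key=lambda s: s.rto_hours)


for service in recovery_order(services):
    print(f"{service.name}: restore {tier(service.rto_hours)}")
```

Keeping the targets as data rather than prose means the running order can be regenerated whenever the business revises a priority.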
Steria encourages mobile working, so many staff were already equipped to work from home. Some staff without equipment were sent laptops overnight. Alternative office space was found for clerical staff.
Alternative IT meant Steria could keep important systems running and also keep staff updated via the company intranet and e-mail, plus the phone and SMS. Managers used a cascade system to filter information down.
Steria says its business continuity culture and service company ethos meant staff were willing to go beyond the call of duty. Some staff with broadband links at home set up makeshift offices for colleagues in their living room.
Basic questions to ask
- If the worst happens, what business processes are critical?
- What will be the cost of interruption for different processes?
- How much data can you afford to lose, and how quickly must different IT services be restored: minutes, hours, days?
- Does your continuity plan cater for these different demands?
- What is the business and IT command chain? What if it is broken?
- Which staff are vital and how will you contact them and keep in touch?
- How will you communicate with customers?
- Who are your suppliers and what are your dependencies on them?
- If you are considering a stand-by service, how many subscribers are there? How many are in the same geographic area?
- Are there other ways to do business?
Sources: Accenture, Gartner, ICM, Siemon