The London 2012 Olympics have ended, and the thousands of athletes who competed are already reviewing their performance and preparing for the next Games four years from now. So why should data centre disaster recovery and IT contingency planning fizzle out post-Games?
To cope with the potential disruptions such as network outages, pressure on bandwidth and remote employees, data centre managers planned their
- If there were any shortcomings when data centre energy suppliers were supporting the Olympic organisation, then communication, power and infrastructure system suppliers would be exposed globally. They had to ensure that their own IT systems were robust enough to deliver business as usual during the Olympics and that their contingency planning included strategies for both foreseen and unforeseen risks.
- Work patterns changed as employees revised their commutes and worked from home or other office locations. Systems designed for office workers were extended to permit access from home. New pressures were placed on security procedures as users were permitted to use their own systems rather than standard-issue PCs.
Business continuity and contingency plans were in the spotlight. In most cases, they stood up to the test. In some situations, additional expenditure was approved to cover the worst cases. IT teams at other organisations believed everything was already in place, trusting enterprise networks and system infrastructure not to fail.
Establishing and managing a contingency plan is a closed-loop process. Post-Olympics is a good time to review how far the business was put at risk and what solutions were implemented to mitigate any disaster.
IT pros can review their data centre disaster recovery and contingency planning with the following five steps:
- Be aware of IT risks and review them regularly, both when applications are upgraded or implemented and when the infrastructure changes. Within this step, data centre managers should:
  - Assess the risk landscape and reflect this against the business risk.
  - Evaluate the maturity levels of applications and system components to determine whether alternative practices need to be implemented for established processes.
  - Identify and prioritise key applications so that if or when there is a system outage, the business can be brought online again with the least impact on the enterprise.
- Quantify the business impact so that IT has a clear understanding of what keeps the business operational.
  - The business impact analysis enables IT to communicate with all business units about the risk and exposure if or when there are planned or unplanned outages.
  - So that applications are not each treated in isolation, IT pros should institute application “classes” so that manageable processes and procedures can be established for everyone within the organisation.
  - Applications can be classed as mission-critical, business-critical, business-important, function-important and non-critical apps.
- Devise the data centre contingency planning appropriate for each application within a service-level framework aligned with the application classes.
  - Define recovery objectives and establish how they will be implemented within agreed recovery times.
  - Put in place a full technical solution design to cover each and every application or system component. Take into account the maturity of each of these applications. For example, virtualisation admins will have to address server deployment, application support, virtual desktop support as well
- Align IT with business value and implement the contingency strategies.
  - IT pros must align the solutions with the needs of each business unit and implement the solutions, clearly communicating with all those responsible within IT and within the business.
  - They must ensure that a full test of the recovery strategy is executed on a regular basis. This should be in line with the system changes which are implemented once or twice a
- Build and manage a unified capability across the business. Implementing an overarching IT risk management governance programme will help IT pros ensure that the investment in data centre contingency planning is adequate.
  - This unified capability may include third-party suppliers.
  - With plans aligned to the business, all parties will better understand their roles and responsibilities, how the processes operate to mitigate risk and how this is reviewed across the five key points outlined here to meet evolving business priorities.
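The classification and recovery-objective steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five class names come from the article, but the recovery time figures, the application names, the `hourly_impact` field and both helper functions are hypothetical. Real values would come out of the organisation's own business impact analysis.

```python
from dataclasses import dataclass
from enum import IntEnum


class AppClass(IntEnum):
    """Application classes named in the article; lower value = recover first."""
    MISSION_CRITICAL = 1
    BUSINESS_CRITICAL = 2
    BUSINESS_IMPORTANT = 3
    FUNCTION_IMPORTANT = 4
    NOT_CRITICAL = 5


# Hypothetical recovery time objectives in hours, one per class.
# These numbers are assumptions for illustration only.
RTO_HOURS = {
    AppClass.MISSION_CRITICAL: 1,
    AppClass.BUSINESS_CRITICAL: 4,
    AppClass.BUSINESS_IMPORTANT: 24,
    AppClass.FUNCTION_IMPORTANT: 72,
    AppClass.NOT_CRITICAL: 168,
}


@dataclass(frozen=True)
class Application:
    name: str
    app_class: AppClass
    hourly_impact: float  # assumed estimate of the cost of one hour of downtime


def recovery_order(apps):
    """Sort by class first, then by highest hourly impact within a class."""
    return sorted(apps, key=lambda a: (a.app_class, -a.hourly_impact))


def rto_breaches(apps, tested_recovery_hours):
    """Names of apps whose last tested recovery time exceeded the class RTO.

    An app with no recorded test is treated as a breach.
    """
    return [
        a.name
        for a in apps
        if tested_recovery_hours.get(a.name, float("inf")) > RTO_HOURS[a.app_class]
    ]


# Hypothetical application inventory.
apps = [
    Application("intranet-wiki", AppClass.NOT_CRITICAL, 50.0),
    Application("order-processing", AppClass.MISSION_CRITICAL, 20000.0),
    Application("email", AppClass.BUSINESS_CRITICAL, 5000.0),
    Application("payments", AppClass.MISSION_CRITICAL, 35000.0),
]

ordered_names = [a.name for a in recovery_order(apps)]
breaches = rto_breaches(
    apps, {"payments": 0.5, "order-processing": 2.0, "email": 3.0}
)
print(ordered_names)
print(breaches)
```

Treating an untested application as a breach is a deliberate choice in this sketch: it keeps the regular recovery test from being quietly skipped for lower-priority classes.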
Reviewing the contingency plans and developing unified strategies for the business will ease IT’s burden in addressing issues around platforms and applications on which the business depends.
IT admins can take other measures while reviewing their data centre disaster recovery and contingency planning by checking the following:
- Web services, supporting external and internal processes
- Email services, on which every business is critically dependent
- Server racks which need to support flexible deployment or realignment of resources, especially within virtualised environments
- Storage resources, secured and available to deliver access to the data required by applications wherever they are running within the virtualised server platforms
- Data centre fabric which must be able to support the virtualised servers, ensuring access to all storage resources, including file-based storage resources connected to the LANs
- Desktop services which may be deployed as desktop virtualisation or use other connectivity architectures, making sure they are secure and able to follow the applications
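One way to make a checklist like this reviewable is to record which applications depend on which services, then ask what an outage in one layer would take down. The sketch below assumes a hand-maintained dependency map; the application names, service labels and both functions are hypothetical and not taken from any particular tool.

```python
# Hypothetical dependency map: application -> infrastructure services it needs.
DEPENDENCIES = {
    "order-processing": {"web", "storage", "fabric"},
    "email": {"email", "storage"},
    "virtual-desktops": {"desktop", "servers", "storage", "fabric"},
    "intranet-wiki": {"web", "servers"},
}


def affected_applications(dependencies, failed_services):
    """Return the sorted names of applications that need any failed service."""
    failed = set(failed_services)
    return sorted(app for app, needs in dependencies.items() if needs & failed)


def shared_services(dependencies, min_apps=2):
    """Return (service, app count) pairs for services used by at least
    `min_apps` applications, most widely shared first."""
    counts = {}
    for needs in dependencies.values():
        for service in needs:
            counts[service] = counts.get(service, 0) + 1
    shared = [(s, n) for s, n in counts.items() if n >= min_apps]
    return sorted(shared, key=lambda item: (-item[1], item[0]))


print(affected_applications(DEPENDENCIES, ["fabric"]))
print(shared_services(DEPENDENCIES))
```

In this made-up inventory, storage is the most widely shared service, which is the kind of finding that would push it up the review list.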
Third-party suppliers are always involved in and around the data centre. By reviewing processes and contracts with these organisations, data centre admins can reduce risks and unplanned expenditure. Areas where this applies include:
- Availability of power supplies and standby power services
- Data and voice communication services in and out of the organisation
- Maintenance services to diagnose and remedy issues
- Migrating data and applications to alternative data centres
Contingency planning is essential and must be financially effective. Over-investment in such services is as bitter a pill to swallow as under-investment. Experience has shown that addressing key practices delivers real gains such as:
- Reduction in the cost of disaster recovery and contingency services
- Improved alignment of services and support between IT and the business
- Standardisation of IT services
- Acceleration of IT consolidation
- Reduction of IT security risk, leading to a sustainable improvement in business practices
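The trade-off between over-investment and under-investment can be illustrated with simple arithmetic: add the yearly cost of a contingency service to the expected cost of the downtime that remains with it, and compare the totals. Everything in this sketch is an assumption made for illustration, including the option names, the cost figures and the expected outage hours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrOption:
    name: str
    annual_cost: float            # yearly spend on the contingency service
    expected_outage_hours: float  # expected downtime per year with this option


def total_annual_cost(option, hourly_impact):
    """Spend on the service plus the expected cost of remaining downtime."""
    return option.annual_cost + option.expected_outage_hours * hourly_impact


def cheapest_option(options, hourly_impact):
    """Pick the option with the lowest total annual cost."""
    return min(options, key=lambda o: total_annual_cost(o, hourly_impact))


# Hypothetical options and figures.
options = [
    DrOption("no standby", 0.0, 40.0),
    DrOption("warm standby", 60000.0, 4.0),
    DrOption("hot standby", 250000.0, 0.5),
]

# The best choice shifts as the hourly cost of downtime rises.
for hourly_impact in (1000.0, 5000.0, 100000.0):
    print(hourly_impact, cheapest_option(options, hourly_impact).name)
```

With these made-up figures, the cheapest overall option moves from no standby to warm standby to hot standby as the hourly impact grows, which is why the business impact analysis has to come before the spending decision.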
Building a flexible data centre requires a new approach. Plans and actions need to be proactive, not reactive; automated processes need to be implemented to anticipate, analyse and prevent system outages rather than respond to gut feel and then search for a system fix; risk assessment needs to be holistic, systematic and collaborative. No longer can islands operate on an ad hoc basis and be unconnected from the rest of the organisation.
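As a minimal sketch of what proactive, automated analysis might look like, the class below counts events in a sliding time window and raises a warning once the rate climbs, rather than waiting for an outage. The class name, window length and threshold are hypothetical, and the code assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class EventRateMonitor:
    """Count events in a sliding time window and flag a rising rate early."""

    def __init__(self, window_seconds, warn_threshold):
        self.window_seconds = window_seconds
        self.warn_threshold = warn_threshold
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one event; return True if the window count has reached
        the warning threshold."""
        self._timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.warn_threshold


# Hypothetical error timestamps in seconds: sparse at first, then a burst.
monitor = EventRateMonitor(window_seconds=10.0, warn_threshold=3)
error_times = [0.0, 20.0, 40.0, 41.0, 42.5]
warnings = [monitor.record(t) for t in error_times]
print(warnings)
```

Only the final event trips the warning here, because it is the third error inside one ten-second window. A production tool would feed such a signal into analysis and remediation steps rather than simply printing it.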
Just as our Olympians are using computerised techniques to analyse and improve their performances, businesses need to implement automated tools to improve their IT performance.
With thousands of events per second, measuring system events, analysing impacts and identifying solutions, whether within server complexes and racks, storage arrays and data management, or across the fabric and networks, is becoming an ever more complex task. Only up-to-date data centre disaster recovery and contingency planning can beat IT complexity.
Hamish Macarthur is the founder of Macarthur Stroud International, a research and consulting organisation specialising in the technology markets. He is a regular contributor to SearchVirtualDataCentre.co.UK.
This was first published in August 2012