Disaster recovery and business continuity: A best practice guide
In this guest post, Paul Timms, managing director of IT support provider MCSA, shares his thoughts on why enterprises can ill afford to overlook the importance of business continuity and disaster recovery.
With downtime leading to reputational damage, lost trade and productivity loss, organisations are starting to realise continuity planning and disaster recovery are critical to success.
Business continuity needs to be properly planned, tested and reviewed in order to be successful. For most businesses, planning for disaster recovery will raise more questions than answers to begin with, but doing the hard work now will save a lot of pain in the future.
Ascertain the appetite for risk
All businesses are different when it comes to risk. Some may view a ransomware attack as a minor inconvenience, dealt with by running on paper for a week while they rebuild systems, whereas others view any sort of downtime as unacceptable.
The individual risk appetite of your organisation will have a significant impact on how you plan and prepare for business continuity. You will need to consider your sector, size and attitude towards downtime, versus cost and resources. Assessing this risk appetite will let you judge where to allocate resources and where to focus your priorities.
Plan, plan and plan some more
To properly plan for disaster recovery, it is critical to consider all aspects of a business continuity event, together with its impact and how to mitigate it.
For example, if power goes down in the organisation’s headquarters, so will the landline phones, but mobiles will still be functional. A way to mitigate this impact would be to reroute the office phone number to another office or to a mobile at headquarters. To do that, you need to consider where the instructions for rerouting the number are stored, and who knows where to find them.
This is just one example. You need to consider all the risks, and all the technology and processes that you use. Consider the plan, the risk, the solution and where you need to invest and strengthen your plan to ensure your business can still function in the event of a disaster.
Build in blocks and test rigorously
Ideally, IT solutions will be built and tested in blocks, so you can consider the risks and solutions in a functional way. You can consider, for example, your network, WAN/LAN, storage and data protection.
Plans for each then need to be tested in a rigorous way with likely scenarios. What if, for example, a particular machine fails? What happens if the power supply cuts out? What happens in the case of a crypto virus? Do you have back-ups? Are they on site? Do you have a failover set-up in case of system failure? Is the second system on site or in a completely different geography? What do you do with your staff? Can they work from home? Are there enough laptops?
These questions will draw out and validate (or invalidate) your assumptions about managing during a business continuity event. For example, if a crypto virus has infected your main servers, it will also have replicated across to your other sites. Your only options are therefore to restore from back-ups or to have a technology solution that allows you to roll back to a point before the virus was unleashed.
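As a rough illustration of the roll-back point, the sketch below picks the most recent back-up taken before an infection was first detected. The function name, the back-up times and the infection time are all hypothetical, and it assumes you can put a reliable timestamp on when the virus arrived, which in practice may be the hard part.

```python
from datetime import datetime
from typing import Optional


def latest_clean_backup(
    backup_times: list[datetime], infected_at: datetime
) -> Optional[datetime]:
    """Return the most recent back-up taken strictly before the infection.

    Returns None when every back-up was taken at or after the infection,
    meaning there is nothing safe to restore from.
    """
    clean = [taken for taken in backup_times if taken < infected_at]
    return max(clean) if clean else None


# Hypothetical nightly back-ups at 02:00 (illustrative values only).
backups = [
    datetime(2024, 3, 1, 2, 0),
    datetime(2024, 3, 2, 2, 0),
    datetime(2024, 3, 3, 2, 0),
]
# Assumed time at which the virus first reached the servers.
infected_at = datetime(2024, 3, 2, 14, 30)

restore_point = latest_clean_backup(backups, infected_at)
print(restore_point)
```

Running this kind of check as part of a test exercise also shows how much work would be lost between the restore point and the infection, which feeds back into how often you back up.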
Cloud is not the only answer
It can be tempting to think cloud can solve all the problems, but that is not always the case. Data in the cloud is not automatically backed up and is not necessarily replicated to a second site. These are options on some public cloud services, but they are often expensive and, as a result, underused.
Despite being cloud-based, a company’s Office 365 environment can still get a virus and become corrupted. If you have not put the IT systems in place to back that data up, then it will be lost. If, for example, the cloud goes down, you need to consider a failover system.
The interesting part of this is that the public cloud doesn’t go down very often, but when it does, it is impossible to tell how long it will be out of action. Businesses must therefore consider when to invoke disaster recovery in that instance.
Know your core systems
One solution that some companies adopt is running core systems and storage on infrastructure that they know and trust. This means knowing where it is and what it is, so it meets their performance criteria. Businesses also consider how this system is backed up. That includes knowing what network it is plugged into, ensuring it has a wireless router as standby, checking that the failover system is at least 50 miles away on a separate IP provider, and confirming that both the replication and the data protection technology work and are tested.
This gives businesses much better knowledge and control in a business continuity event such as a power outage. They can get direct feedback about the length of the outage, which gives them better visibility and the ability to make the right decisions.
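The 50-mile separation is easy to check once you know where each site is. The sketch below is one way to do it, using the haversine formula for straight-line distance; the coordinates are illustrative city-centre values rather than real data centre locations, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def far_enough(primary, failover, minimum_miles: float = 50.0) -> bool:
    """True when the failover site is at least minimum_miles from the primary."""
    return distance_miles(*primary, *failover) >= minimum_miles


# Illustrative (latitude, longitude) pairs, not real site locations.
primary = (51.5074, -0.1278)  # central London
nearby = (51.3762, -0.0982)   # Croydon
distant = (52.4862, -1.8904)  # Birmingham

print(far_enough(primary, nearby))
print(far_enough(primary, distant))
```

Distance is only one of the checks listed above. Whether the two sites share a power grid or a network provider matters just as much and cannot be read off a map.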
Plan and prioritise
When making a plan, you need to consider the risks and your objectives. A useful approach towards technology can be to consider how it can help mitigate these risks and help you meet your objectives. When considering budget, there is no upper limit to what you can spend, so focus instead on your priorities and then have the board sign them off.
Spending a day brainstorming is a good way of working out what concerns your organisation the most and what will have the most detrimental impact should it go wrong. Needless to say, anything with a high risk of serious impact needs to be prioritised. As for who executes the business continuity plan, as the saying goes, don’t put your eggs in one basket. Involving numerous people, and so ensuring more than one person is trained in the plan, will significantly mitigate the impact of any event.
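One simple way to turn the output of a brainstorming day into a priority list is to score each risk. The sketch below assumes a common likelihood-times-impact scheme on 1 to 5 scales; that scheme, the class name and the example scores are assumptions for illustration, not something prescribed above.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain), assumed scale
    impact: int      # 1 (minor) to 5 (severe), assumed scale

    @property
    def score(self) -> int:
        """Simple priority score: likelihood multiplied by impact."""
        return self.likelihood * self.impact


def prioritise(risks: list[Risk]) -> list[Risk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


# Hypothetical scores a team might agree on during the session.
risks = [
    Risk("Power outage at headquarters", likelihood=3, impact=3),
    Risk("Crypto virus on main servers", likelihood=2, impact=5),
    Risk("Single machine failure", likelihood=4, impact=1),
]

ranked = prioritise(risks)
for risk in ranked:
    print(f"{risk.score:>2}  {risk.name}")
```

The numbers matter less than the conversation that produces them. The ranked list is what goes to the board for sign-off.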