There is no single uniform approach to disaster planning or recovery. Each organization must establish plans and implement tools that are appropriate for its particular business model and compliance obligations. Regardless of your specific approach, however, disaster planning is not a one-time academic exercise. In practice, disaster recovery (DR) plans often necessitate changes in the storage infrastructure and impose other overhead tasks that must be addressed. DR plans must also be tested and updated periodically to ensure that they remain relevant as the business grows or hardware changes. Let's take a look at the most pressing DR management issues.
Disaster recovery plans typically involve changes to an existing storage or network infrastructure. Ultimately, a storage administrator must budget and schedule the hardware, software, labor and facility costs needed to accommodate DR plans. Hardware additions may be as simple as adding a tape drive or tape library but often require more substantial additions, like dedicated storage systems. One example might be the acquisition of a NearStor virtual tape library from Network Appliance Inc. or an Axion backup/recovery system from Avamar Technologies Inc.
In most cases, backups intended for DR purposes are sent to a remote location. Services like Iron Mountain Inc. can transport physical tapes to a secure off-site vault, but an increasing number of organizations are practicing remote replication between storage systems at two or more locations. For example, a bank may use a WAN link to replicate data from one EMC Corp. Centera in its main data center to a secondary Centera located in a backup data center across the state.
DR depends on software, usually one or more applications such as backup, snapshot, mirroring or replication tools. Some examples include EMC's Symmetrix Remote Data Facility software, designed to replicate Symmetrix systems, as well as Avamar's Replicator software, intended to replicate heterogeneous systems across a WAN. Whether software is bundled with the storage system or acquired separately, IT staff must invest the time to become proficient with each tool. Smart managers will ensure that key IT personnel have that time.
Once the DR infrastructure is in place, it takes a serious effort to establish and maintain the backup. This may require an evening or weekend to make full backup tapes or synchronize data between replication sites across a WAN. After the initial replication, an IT department must allocate the time to tackle incremental tape backups or nightly replication.
You rely on backups to protect you against disaster, but are the backups themselves vulnerable to disaster? Whenever corporate data resides outside the direct control of an IT department, it's important to consider the implications of data security. Any remote location should start with an evaluation of physical security.
Tape storage or remote data center equipment should always be kept under lock and key -- accessible only to a minimum number of authorized personnel. Fire extinguishers and suppression systems should use gases that are friendly to electronic equipment and digital media (water-based systems should be avoided). The geographic location should also be free from flooding, earthquakes and even potential terrorist targets. Feel free to inspect a remote facility in advance. If the facility is managed by another company (such as Iron Mountain), take the time to discuss its security and disaster plans, and define its liability for your vital data.
The data itself may need to be secured through encryption techniques. As a rule, only personally identifiable information must be secured (such as customer records with Social Security or credit card numbers), though organizations that replicate data often choose to encrypt all data in order to maintain security across an open WAN (a.k.a. the Internet). Encryption can be handled through backup software or implemented through dedicated encryption appliances integrated into the network, such as the DataFort product family from Decru Inc. See the SearchStorage.com Tech Roundup on encryption tools.
Testing and training
Even the best DR plan is useless if it cannot be implemented, so an important part of DR management is periodic testing and training, bringing new IT personnel up to speed on the DR process and verifying that recovery is achievable within the specified recovery time objective (RTO). Recovery drills can be tricky because they are disruptive -- a production network must be brought offline and recovered from the very latest disaster backup.
Some organizations avoid lost production time and the risk of unexpected problems by practicing with a test (lab) system. That is, a scaled-down environment is backed up and then recovered using the same means employed by the production network. While this tactic does not verify the actual network, it does provide important practice for IT personnel. Drills often include discussion time for personnel to evaluate the plan and make any recommendations to streamline or improve the DR process.
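Whether a drill is run against production or a lab system, the pass/fail question is the same: did the measured recovery time fit within the RTO? The following is a minimal sketch of that comparison. The function name, the timestamps and the four-hour RTO are all hypothetical values chosen for illustration, not figures from any particular plan.

```python
from datetime import datetime, timedelta


def met_rto(outage_start: datetime, service_restored: datetime, rto: timedelta) -> bool:
    """Return True if the measured recovery time is within the RTO."""
    if service_restored < outage_start:
        raise ValueError("service_restored precedes outage_start")
    return (service_restored - outage_start) <= rto


# Hypothetical drill: systems taken offline at 22:00 and restored at 01:30.
drill_start = datetime(2024, 1, 13, 22, 0)
drill_end = datetime(2024, 1, 14, 1, 30)

# Assumed RTO of four hours; the drill took 3.5 hours.
drill_passed = met_rto(drill_start, drill_end, rto=timedelta(hours=4))
print("Drill met RTO:", drill_passed)
```

Recording the start and end times of each drill in this way also gives the discussion session a concrete number to work from.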
There are no solid guidelines that dictate how often a DR plan should be tested, though once a year is probably the minimum frequency. In addition to regularly scheduled testing, extra tests can be run as needed when personnel turnover occurs or when changes to the DR plan are implemented. If you do business with a DR service provider, you may need to schedule testing time in advance.
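The scheduling rule described here (at least once a year, plus extra tests after staff turnover or plan changes) can be written down as a simple check. This is only a sketch: the 365-day interval is an assumption standing in for "once a year," and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "once a year" is treated as a 365-day maximum between tests.
MAX_TEST_INTERVAL = timedelta(days=365)


def dr_test_due(last_test: date, today: date,
                staff_changed: bool = False,
                plan_changed: bool = False) -> bool:
    """Return True if a DR test should be scheduled.

    A test is due when personnel or the plan have changed since the last
    test, or when the maximum interval between tests has elapsed.
    """
    if staff_changed or plan_changed:
        return True
    return today - last_test >= MAX_TEST_INTERVAL


# Hypothetical dates for illustration.
print(dr_test_due(date(2023, 6, 1), date(2024, 1, 10)))                      # interval not yet elapsed
print(dr_test_due(date(2023, 6, 1), date(2024, 1, 10), staff_changed=True))  # turnover triggers a test
```

An organization that tests more often than annually would simply shorten the interval constant.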
Updating the plan
Finally, DR plans are never static. Changes invariably occur with storage resources, applications, IT personnel and even business units or corporate practices. As changes take place, the DR plan must be updated to accommodate those changes. For example, if 200 GB of additional storage capacity is added or a new storage array is installed, that additional storage must be included in the DR cycle. As another example, new privacy legislation may require files to be encrypted where they may not have been encrypted in the past.
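One way to catch storage that was added but never folded into the DR cycle is to compare the current storage inventory against the list of resources the plan protects. The sketch below assumes both lists are available as simple sets of names; the array and volume names are invented for illustration.

```python
def uncovered_resources(inventory: set, dr_plan: set) -> list:
    """Return resources that exist in the inventory but are not in the DR plan."""
    return sorted(inventory - dr_plan)


# Hypothetical inventory: array-02 was installed after the plan was written.
inventory = {"array-01", "array-02", "nas-01"}
dr_plan = {"array-01", "nas-01"}

missing = uncovered_resources(inventory, dr_plan)
print("Not covered by the DR plan:", missing)
```

Running a comparison like this after every storage change keeps the plan from silently falling behind the infrastructure.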
Changes can also have secondary effects on the DR plan. Consider the 200 GB of additional storage capacity added in the previous example. Since more storage will take longer to back up, it may be necessary to consider a different tape technology or increase the WAN bandwidth to maintain acceptable RTOs. For larger organizations, a system of change management may be needed to report on any organizational changes precipitating a possible adjustment to the DR plan.
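A rough estimate shows why extra capacity can force a bandwidth decision. The sketch below computes how long a given amount of data takes to move across a link, using decimal units (1 GB = 8,000 megabits). The 45 Mbps and 100 Mbps link speeds and the efficiency factor are assumptions for illustration; real throughput depends on protocol overhead, compression and competing traffic.

```python
def transfer_hours(capacity_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate hours needed to move capacity_gb across a link.

    capacity_gb -- data volume in decimal gigabytes (1 GB = 8,000 megabits)
    link_mbps   -- nominal link speed in megabits per second
    efficiency  -- assumed fraction of nominal speed actually achieved (0-1]
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    megabits = capacity_gb * 8000
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# The 200 GB from the example, over two hypothetical WAN links.
for mbps in (45, 100):
    hours = transfer_hours(200, mbps)
    print(f"200 GB over {mbps} Mbps: about {hours:.1f} hours")
```

At the assumed 45 Mbps, the additional 200 GB alone needs nearly 10 hours for a full copy, versus roughly 4.4 hours at 100 Mbps, which is the kind of difference that determines whether a nightly window or a recovery target can still be met.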