Disk-based backup overview
- Posted: 17:17 06 Aug 2007
Disc backup systems
Backups are all about time. It can take several hours to back up a large server to tape. Backing up an entire data center to tape can take 12 hours or more. Since most network data is inaccessible during a backup process, the procedure is typically performed in the evenings or off-peak hours to minimise service interruptions to end users.
With hard disc costs falling, administrators realise that disc-based storage can accomplish the same backup tasks in a fraction of the time needed for tape, while remaining cost-competitive with tape systems. Not only does this shrink the backup window, but those same backup discs can also speed recovery, helping to meet recovery time objectives (RTOs). Once data is on a disc-based backup system, RAID techniques help maintain data integrity and guard against data loss when individual discs fail. These are significant benefits for busy organisations that rely on 24/7 network operations.
Disc vs. tape
Making an argument for tape is becoming increasingly difficult. Tape is a mature and relatively inexpensive offline storage technology, but it does have limitations. Tape is generally slow, so large backup and restore operations can take hours -- even days. Searches are often impractical unless the user knows exactly which tapes and file names are required -- difficulties that increase dramatically when there are hundreds or thousands of tape cartridges to contend with. The media itself is vulnerable to loss or theft, especially when tapes must be transported to off-site storage.
Disc-based storage has emerged as a cost-effective alternative to tape, allowing companies to retain huge amounts of data on inexpensive SATA or SAS disc arrays. Disc's superior performance supports quick backups and restorations, and improves the user service level by making corporate data available "nearline." Disc arrays also leverage RAID techniques to maintain data integrity -- if one disc fails, the data on that disc can be rebuilt to a spare drive. Although disc systems often cost more to purchase than tape libraries, users note that the total cost of ownership is generally comparable. One notable disadvantage of disc is that ordinary media (the disc drives) cannot be removed and shipped to off-site storage unless the storage system employs specially designed removable disc drives. When ordinary discs are employed as a local backup target, the data is usually replicated off-site across a WAN to support disaster recovery or business continuance.
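The rebuild step mentioned above can be illustrated with a minimal Python sketch. It assumes a single-parity XOR layout of the kind used by RAID 5-style arrays; the article does not say which RAID level is involved, and mirrored arrays rebuild by simple copying instead. The block contents and the helper name `xor_blocks` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data discs, each holding one 4-byte block of a stripe.
data_discs = [b"\x10\x20\x30\x40", b"\x0a\x0b\x0c\x0d", b"\xff\x00\xff\x00"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data_discs)

# Simulate losing disc 1, then rebuild its block onto a spare
# from the surviving data blocks plus the parity block.
failed = 1
survivors = [blk for i, blk in enumerate(data_discs) if i != failed]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data_discs[failed])
```

Because XOR is its own inverse, any one missing block in the stripe can be recovered this way; losing two blocks at once is beyond what single parity can repair.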
Virtual tape libraries
Although tape systems are hard pressed to meet shrinking backup and restore windows, many companies have a significant investment in tape-oriented backup software and management tools. Rather than discarding the tape paradigm and re-architecting a backup system from scratch, administrators often deploy virtual tape libraries (VTLs) as a more convenient solution. A VTL stores data on hard disc, but it appears to be a conventional tape system to backup software and hierarchical storage management (HSM) tools. Administrators can create "virtual tape drives" and segment virtual "tape cartridges" in the disc space, allowing the VTL to mimic a tape library in every way. This gives VTLs far better performance than true tape libraries, while also easing integration with existing storage infrastructures and keeping the learning curve to a minimum. IBM, Sun (StorageTek brand) and FalconStor Software Inc. are three recognised VTL vendors.
Emerging backup technologies
Disc storage is being deployed in a variety of advanced backup tasks. Disc arrays are implementing data deduplication (also known as intelligent compression or single-instance storage) for backup targets and archiving platforms such as content addressed storage (CAS). Single-instance storage is notable because only one copy of a given file is ever retained -- even when multiple copies of a file or attachment are being backed up. Deduplication can offer disc space savings of up to 50-to-1. Many CAS archiving systems also include fingerprinting technologies that verify files have not been modified, which is beneficial for electronic discovery and other corporate governance needs [see the SearchStorage.com article on CAS].
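As a rough illustration of single-instance storage and fingerprinting, the Python sketch below keeps one copy of each distinct file, keyed by a content hash. It is a toy model under stated assumptions, not how any particular product works: the class name `SingleInstanceStore`, its methods and the use of SHA-256 are all hypothetical choices, and real systems often deduplicate at the block or sub-file level rather than per file.

```python
import hashlib

class SingleInstanceStore:
    """Toy file-level deduplicating store (illustrative only)."""

    def __init__(self):
        self.blobs = {}    # fingerprint -> content, stored once
        self.catalog = {}  # backed-up path -> fingerprint

    def backup(self, path, content):
        fingerprint = hashlib.sha256(content).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self.blobs.setdefault(fingerprint, content)
        self.catalog[path] = fingerprint
        return fingerprint

    def restore(self, path):
        return self.blobs[self.catalog[path]]

    def verify(self, path):
        """True if the stored content still matches its recorded fingerprint."""
        fingerprint = self.catalog[path]
        return hashlib.sha256(self.blobs[fingerprint]).hexdigest() == fingerprint

    def dedup_ratio(self):
        """Bytes the clients sent divided by bytes actually kept on disc."""
        logical = sum(len(self.blobs[fp]) for fp in self.catalog.values())
        stored = sum(len(content) for content in self.blobs.values())
        return logical / stored

# Ten hypothetical mailboxes all hold the same attachment.
store = SingleInstanceStore()
attachment = b"quarterly report " * 1000
for user in range(10):
    store.backup(f"/mail/user{user}/report.doc", attachment)

print(len(store.blobs), store.dedup_ratio())
```

The ratio achieved depends entirely on how much redundancy the backed-up data contains; ten identical copies give 10-to-1 here, while unique files would give no saving at all.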
Discs are also providing improved backups using snapshot and continuous data protection (CDP) techniques to lower recovery times. Snapshots are periodic "saves" of system data, which can be initiated several times each day -- even once an hour. System administrators can recover a system from its latest snapshot on disc. CDP offers finer granularity: it records each write operation in a disc journal, allowing administrators to recover a system right down to the last write with virtually no data loss. These techniques depend on the random access that disc provides and are not practical with tape.
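The difference in recovery granularity between snapshots and CDP can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the class `ProtectedVolume`, its methods and the integer timestamps are hypothetical, writes are assumed to arrive in time order, and real products track far more state than a dictionary of blocks.

```python
class ProtectedVolume:
    """Toy block volume with periodic snapshots and a CDP-style write journal."""

    def __init__(self):
        self.blocks = {}      # block number -> current data
        self.snapshots = []   # (time, full copy of blocks), in time order
        self.journal = []     # (time, block number, data), in time order

    def write(self, t, block_no, data):
        self.blocks[block_no] = data
        self.journal.append((t, block_no, data))  # CDP: record every write

    def snapshot(self, t):
        self.snapshots.append((t, dict(self.blocks)))  # periodic point-in-time copy

    def recover_from_snapshot(self, t):
        """State as of the latest snapshot taken at or before time t."""
        usable = [blocks for when, blocks in self.snapshots if when <= t]
        return dict(usable[-1]) if usable else {}

    def recover_from_journal(self, t):
        """State rebuilt by replaying every journalled write up to time t."""
        state = {}
        for when, block_no, data in self.journal:
            if when > t:
                break
            state[block_no] = data
        return state

vol = ProtectedVolume()
vol.write(1, 0, b"A")
vol.snapshot(2)
vol.write(3, 0, b"B")
vol.write(4, 1, b"C")

# Suppose the system fails just after time 4.
print(vol.recover_from_snapshot(4))  # loses the writes made after the snapshot
print(vol.recover_from_journal(4))   # includes every write up to the failure
```

The snapshot restore returns the volume as it stood at time 2, so the two later writes are lost, while replaying the journal reproduces them. The journal can also be replayed to an earlier moment, for example time 3, to step back past an unwanted change.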