Is your data too fat for your backup window?

Glasshouse Technologies (UK) consultant David Boyd looks at how to fit ever-increasing data volumes into acceptable backup windows.

While on a customer site recently, we had the interesting challenge of securing a 10 TB database to tape on a nightly basis. The data volume had grown exponentially over the past year and, as with so many environments, budgetary restrictions meant that snapshot and mirroring technologies were unavailable – off-host backups weren't going to be possible.

Fortunately, the server hosted a large Oracle database with minimal file system fragmentation and was well stocked with HBAs for both disk and tape. After some tweaking – a handful of dedicated LTO3 tape drives and some non-application-friendly memory tuning – we had a backup that took less than 14 hours: not ideal, but deemed adequate.
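The arithmetic behind that result is worth making explicit. A rough sketch (assuming LTO3's commonly quoted native rate of around 80 MB/s and ignoring compression, multiplexing and start/stop overhead) shows why a handful of drives was enough:

```python
import math

def required_throughput_mb_s(data_tb: float, window_hours: float) -> float:
    """Aggregate MB/s needed to move data_tb terabytes within window_hours."""
    return data_tb * 1e6 / (window_hours * 3600)  # 1 TB = 1e6 MB

def drives_needed(data_tb: float, window_hours: float, drive_mb_s: float) -> int:
    """Minimum drive count, assuming each drive streams at full rated speed."""
    return math.ceil(required_throughput_mb_s(data_tb, window_hours) / drive_mb_s)

# 10 TB in a 14-hour window needs ~198 MB/s aggregate...
print(round(required_throughput_mb_s(10, 14)))
# ...which three LTO3 drives can cover at ~80 MB/s native each
print(drives_needed(10, 14, 80))
# Shrink the window to a 'standard' eight hours and the count jumps to five
print(drives_needed(10, 8, 80))
```

The point of the sketch is that the drive count is driven entirely by the window: the same data set against an eight-hour window would have demanded nearly twice the tape infrastructure, which is exactly the problem described below.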

Our situation was made easier – in no small way – by the fact that the backup team didn't have any SLAs in place. Had the service owners demanded backups take no more than a 'standard' eight hours, we would have had problems with the available infrastructure.

Rapidly growing data volumes are a challenge that everyone in IT – in fact, everyone in the wider business – has to rise to. A 10 TB database is big, but by today's standards it is not huge. For the backup admin, growing databases and file servers are a daily headache. Tape technologies may be getting faster (as are line speeds), but reading data from host disk, transporting it and writing it to tape is becoming increasingly difficult. Sadly for us, data volumes don't increase in line with technology refreshes. Unmonitored backup windows gradually extend and eventually cause knock-on problems that threaten business continuity.

In the scenario described above, we were aided by the fact that the customer hadn't specified service level agreements. But SLAs with defined RTOs (recovery time objectives) and RPOs (recovery point objectives) are a great help. I've often come across environments with a 'best endeavours' approach to backups, and the problem is that it is simply not possible to design and run an efficient environment within such intangible parameters.

Had the customers defined an eight-hour RTO, then the design for that environment would have looked completely different. To ensure that service could be recovered in that window, the engineering teams would likely have insisted that the application owners purchase additional disk for mirror copies and software to maintain database consistency. Backups would have been performed off-host on disk optimised for sequential reads. For now, at least, tape speeds allow large databases to be backed up in a timely fashion, but with an increasing dependency on proper environment design.
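To see why an eight-hour RTO changes the design, consider the restore side rather than the backup side. A minimal sketch (again assuming three LTO3 drives streaming at ~80 MB/s native, with no rewinds, positioning delays or database recovery time included – so a best case):

```python
def restore_hours(data_tb: float, aggregate_mb_s: float) -> float:
    """Hours to stream data_tb terabytes back from tape at aggregate_mb_s MB/s."""
    return data_tb * 1e6 / aggregate_mb_s / 3600  # 1 TB = 1e6 MB

# Three LTO3 drives (~240 MB/s aggregate) restoring the 10 TB database
print(round(restore_hours(10, 3 * 80), 1))  # well over eight hours, best case
```

Even under these generous assumptions the raw streaming time alone blows through an eight-hour RTO, before Oracle recovery has even started – which is why the design would have pushed towards disk mirrors instead of tape as the primary recovery mechanism.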

There is one other question to ask when encountering large and growing data volumes, and that is data value. All this extra data that is created on a daily basis has to be worth something... doesn't it? How much of it is replicated from elsewhere and how much of it can, therefore, be recreated – rather than restored – in the event of a disaster? These are hard questions to ask, especially for anyone below middle management. They are even harder questions to answer, and the answers even more difficult to interpret. However, until those tricky questions are answered, you may find yourself having to back up ever-increasing data volumes, and that can only be done if the parameters are well-defined and the budget is adequately provided.

About the author: David Boyd is a senior consultant at Glasshouse Technologies (UK), a global provider of IT infrastructure services. He has more than seven years' experience in backup and storage, with a major focus on designing and implementing backup solutions for blue chip companies.
