Ernesto Lobo, a data recovery engineer for Kroll Ontrack Inc. based in Epsom, Surrey, specialises in virtual recoveries. He discusses some of the data recovery challenges posed by virtual environments.
Have you seen an increase in the number of data recovery requests as virtualisation adoption has grown in the U.K.?
Ernesto Lobo: Yes, indeed. In fact, we have seen a tenfold increase in the number of jobs in virtual environments in the last year.
Why has this data recovery increase occurred?
Lobo: I think it is due to VMware, one of the biggest vendors of virtualisation products, becoming so popular over the last couple of years. I am following companies as they move from the testing phase to the production phase with virtual environments.
What risks does virtualisation pose to organisations?
Lobo: One of the reasons to choose virtualisation is consolidation, but it is also one of the biggest weaknesses if you don't have a good business continuity strategy. When you have all this data consolidation, if you have any kind of data loss situation, you are losing a whole lot more data than you would in an environment where you have only physical servers.
There are particular types of applications that are still better suited to physical servers. Applications that are very input/output-intensive in terms of storage may need a mixture of virtual and physical devices. You may want to keep physical storage dedicated to a database server, for example, if you want maximum input/output speeds on that particular server.
The most common causes of data loss we have seen include deleted virtual machines and files on VMFS [Virtual Machine File System] volumes, RAID failure or configuration problems and reformatted or re-installed VMFS volumes.
What is the role of human error in data loss incidents in virtual environments?
Lobo: One of the most common types of data loss situation that we see is when whole virtual machines are deleted. Where before a company could lose a single server due to a RAID problem, in a virtual environment 20 to 30 servers can become inaccessible at the same time.
We did a job for a company in the U.K. banking industry where they were carrying out some maintenance work on their virtual environment. They had a replication system set up between the main site and the disaster recovery site, but they didn't stop replication before starting the maintenance work, and they managed to format one of the data stores, which contained probably between 15 and 20 virtual servers.
When that happened, the copy at the replicated site was formatted as well, and at that point they realised they didn't have a recent backup of an Oracle database containing very critical data. They first contacted the storage vendor and VMware, and when those were not able to get the data back, VMware came to us. On this occasion, we had to go on-site because of the sensitivity of the data and the complexity of the job, and within 24 hours we were able to get the Oracle server back.
We also had a company from the construction industry in the U.K. that deleted a virtual machine that had a Windows Server 2003 guest running SQL Server. They simply right-clicked on it and deleted the server, which deleted the files from a 1 TB VMFS partition. Once you have deleted the files on VMFS, the system leaves no trace of them. But we have a number of utilities that we have developed with VMware to work on these types of cases.
If data is more vulnerable in virtual environments, should companies be dissuaded from virtualisation adoption?
Lobo: No, not at all. [Virtualisation] technology [is very new], and everybody is just getting used to it. But it is developing very quickly. It is especially important to adapt business continuity strategies to the new technologies.
What should companies do if data loss occurs in their virtual environment?
Lobo: We very often see people restore data from backups onto the drives where they had the data loss. Only once they have restored, for example, a terabyte of data do they realise that the backups weren't good at all. And at that point, it is a lot harder to get at the original data.
It is very important to analyse the data loss situation [in the virtual environment] first and work out the best approach to get the data back up and running. Definitely contact the vendors, software and hardware, and just get the more experienced people involved, especially in this new environment, where things may seem easy but they are not. If you have a data loss situation, trying to fix things yourself can definitely make things worse.
The [need] for data recovery could be avoided if companies were aware of the differences between the virtual world and the physical world. Your backup strategy and approach need to be adapted to take into account the virtual servers that you are backing up, because you can't back them up the same way as you would physical servers; otherwise you run into difficulties when you try to restore.
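The advice above, to check that a backup is good before restoring it over the drives where the data loss happened, can be illustrated with a minimal Python sketch. It is not Kroll Ontrack's or VMware's tooling. The function names, the JSON manifest of SHA-256 checksums and the idea of restoring to a separate scratch directory are all assumptions made for this example.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(backup_dir: Path, manifest_path: Path) -> list[str]:
    """Return the names of files that are missing or fail their checksum.

    Assumption: the manifest is a JSON object mapping relative file names
    to SHA-256 digests recorded when the backup was taken.
    """
    expected = json.loads(manifest_path.read_text())
    problems = []
    for name, digest in expected.items():
        candidate = backup_dir / name
        if not candidate.is_file() or sha256_of(candidate) != digest:
            problems.append(name)
    return problems


def restore_to_scratch(backup_dir: Path, manifest_path: Path, scratch_dir: Path) -> bool:
    """Copy the backup to a scratch location only if every file verifies.

    The affected volume is never written to, so the original data stays
    untouched for a recovery attempt if the backup turns out to be bad.
    """
    if verify_backup(backup_dir, manifest_path):
        return False
    shutil.copytree(backup_dir, scratch_dir, dirs_exist_ok=True)
    return True
```

A checksum match only shows that the backup files are unchanged since the manifest was written. It does not prove that the virtual machines inside them will boot, so a test restore to separate storage is still worthwhile.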
About the interviewee: Ernesto Lobo has been the Kroll Ontrack Inc. virtual data recovery specialist at the U.K.-based recovery facility for nearly three years. With over five years of virtualisation experience, Ernesto has been involved in the development of Kroll Ontrack's data recovery solution for virtual data and remote data recovery.