Eight data migration tips

Learn how to make your next data migration project run better with these eight tips.

Whatever storage media your data sat on a year or two ago, chances are it's moved since then and will likely move again soon. There are plenty of reasons why that data may have to move: maybe the lease is up on an old Fibre Channel (FC) SAN and you're upgrading to new hardware, you're moving to a new data centre or you need to move older files to less expensive storage to keep up with soaring data demands.

Data migration may be a common chore, but that doesn't mean it's easy. Disk (and tape) drives are linked to applications and business processes through servers, routers, switches, and storage and data networks, not to mention access control policies and other layers of security. The more complex your environment, and the more data you're managing, the less likely you'll be able to use simple copy functions built into operating systems or arrays to pull off your required migrations.

Migrating data involves a lot more than just ripping out one storage cabinet and plugging in another. The following tips will help make your data migrations go more smoothly.

1. Understand your mapping

Before migrating any data to new storage arrays, be sure you understand how servers are currently mapped to storage so you can re-create those mappings in the new environment. Otherwise, servers may not reboot correctly after the migration.

To avoid unplanned outages, administrators should "understand the true end-to-end relationships among the platforms you're moving across," says Lou Berger, senior director of products and applied technologies at EMC. This is especially important if, for redundancy purposes, your storage infrastructure is a multipathing environment where hosts may boot from alternate arrays if the primary array is down. If administrators fail to check the parameters on the host HBAs to ensure the pathing software is set up correctly, he says, the host may not reboot properly.

Administrators also need to be sure the host will discover storage resources in the proper order after a migration. "Some applications and databases are sensitive to the order in which they discover volumes," says Berger, because an application boot sequence might be on one LUN and its data on another.
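One way to make the mapping check concrete is to record each host's volumes, in discovery order, before the migration and compare that record with the new environment afterwards. The following is a minimal sketch, not a replacement for a discovery or auditing tool: the host names, volume labels and the `compare_mappings` function are hypothetical, and in practice the snapshots would come from whatever inventory the administrator already trusts.

```python
def compare_mappings(before, after):
    """Compare host-to-volume mappings taken before and after a migration.

    Both arguments map a host name to the list of volumes that host
    discovers, in discovery order. Returns a dict of host -> list of
    problems; hosts with no problems are omitted.
    """
    problems = {}
    for host, old_volumes in before.items():
        issues = []
        new_volumes = after.get(host)
        if new_volumes is None:
            issues.append("host missing from new environment")
        else:
            missing = [v for v in old_volumes if v not in new_volumes]
            extra = [v for v in new_volumes if v not in old_volumes]
            if missing:
                issues.append(f"missing volumes: {missing}")
            if extra:
                issues.append(f"unexpected volumes: {extra}")
            # Same set of volumes, but discovered in a different order.
            if not missing and not extra and old_volumes != new_volumes:
                issues.append("discovery order changed")
        if issues:
            problems[host] = issues
    return problems


# Hypothetical snapshots, for illustration only.
before = {"db01": ["boot", "data", "logs"], "web01": ["boot", "content"]}
after = {"db01": ["data", "boot", "logs"], "web01": ["boot"]}
report = compare_mappings(before, after)
```

Here `report` flags `db01` for a changed discovery order and `web01` for a missing volume. A check like this only catches hosts that appear in the "before" snapshot, so it does nothing for servers nobody remembered to inventory.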

Administrators may not even know a server exists until it fails to reboot after a migration, "because oftentimes people install them and forget them," says Ashish Nadkarni, a principal consultant at GlassHouse Technologies. While storage discovery and auditing tools are valuable, he says, none of them can capture 100% of the misconfigurations that can cause a problem.

2. Gather metrics

Jalil Falsafi, director of information technology at electronic components distributor Future Electronics, had to migrate data from IBM DS4100 and DS4300 entry-level arrays to Hewlett-Packard (HP) StorageWorks XP24000 arrays during intervals of relatively slow network traffic over a period of six weeks. That required an in-depth understanding of the capacity of his SAN and when other functions, such as a database backup, would increase network loads.

"You have to scope how many LUNs, or logical disks, you're going to migrate. You have to know their size; you have to know the speed of your array; you have to know the speed of your switch as well as 'hot spots' when traffic loads are very heavy," says Future Electronics' Falsafi. "You need to take the worst-case scenario into consideration, not the average or the minimum."

Falsafi used monitoring tools available in FalconStor Software's IPStor network storage server, as well as host- and array-based utilities, to gather those metrics.

"Migration can have a severe impact on overall system performance," says Chris McCall, product marketing director at LeftHand Networks Inc. (which is being acquired by HP). "It becomes a fairly nasty issue [with questions such as] 'Is my controller performance maxed out already or close to maxed?'" He warns that overloading a storage or data network with migration traffic can reduce the availability or performance of not only the data being migrated, but all of the data on the network.

Measuring network bandwidth needs before performing a migration is a chore that can be easily overlooked, says Greg Schulz, founder and senior analyst at StorageIO Group. "Unless you know for sure, go out and double-check to see what the impact is going to be," he says. Once an administrator is sure how much bandwidth should be allocated to the migration and when it will be available, the bandwidth can be managed with tools such as optimisation technologies, replication optimisers and traffic shapers, he adds.
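The scoping arithmetic Falsafi describes can be sketched in a few lines: total the LUN sizes, take the slower of the array and the switch as the bottleneck, and allow the migration only the share of that bandwidth that is free at the busiest time. The function name, the `usable_fraction` parameter and every figure below are assumptions made for illustration, not measurements from any of the environments mentioned here.

```python
def estimate_migration_hours(lun_sizes_gb, array_mb_s, switch_mb_s, usable_fraction):
    """Rough worst-case estimate of how long a migration will take.

    lun_sizes_gb    -- sizes of the LUNs to migrate, in GB
    array_mb_s      -- worst-case sustained array throughput, in MB/s
    switch_mb_s     -- worst-case sustained switch throughput, in MB/s
    usable_fraction -- share of the bottleneck the migration may consume
                       (0 < fraction <= 1) once other traffic is allowed for
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in the range (0, 1]")
    # The slowest component in the path sets the ceiling.
    bottleneck_mb_s = min(array_mb_s, switch_mb_s) * usable_fraction
    total_mb = sum(lun_sizes_gb) * 1024
    return total_mb / bottleneck_mb_s / 3600


# Hypothetical figures: three LUNs totalling 2,000 GB, a 200 MB/s switch as
# the bottleneck, and half of that bandwidth reserved for production traffic.
hours = estimate_migration_hours([500, 500, 1000], 400, 200, 0.5)
print(f"Estimated migration time: {hours:.1f} hours")
```

With these made-up inputs the estimate comes to about 5.7 hours. Feeding in average rather than worst-case throughput would produce a shorter, more optimistic figure, which is exactly the trap Falsafi warns against.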

3. Downtime isn't so bad

Some vendors claim they can migrate data without causing any downtime for applications. But some observers, such as Gary Fox, director of national services, data center and storage solutions at Dimension Data, recommend building in some downtime because it's tricky to migrate data and ensure its consistency during regular production hours. If possible, he suggests, do migrations during non-business hours "so you're not under so much pressure" in case something goes wrong.

"I'm kind of old school in this regard," he adds.

4. Watch for security leaks

When migrating data among arrays from various vendors, permissions and security settings can be left behind, making the data vulnerable to theft, corruption or misuse. Even moving data among file systems--say, from NTFS to NFS--can result in a loss of permission and security settings, says GlassHouse Technologies' Nadkarni. "If you're moving ... from Windows to Unix or Unix to Windows, you have to be very, very cautious because more often than not the user permissions are completely destroyed," he says.

The easiest way to avoid security issues is to do a block-level rather than a file-level migration. That way, the migration is performed at "a level below the file system, so the host doesn't even see the difference" in the data, says Nadkarni.

It's possible to maintain security settings in a file-based migration, he notes, if the source and target systems lie within the same authentication or authorisation domain in a service such as Microsoft's Active Directory. Some file-based migration tools also have the intelligence required to maintain such security settings, he notes.

Digging into the details of how a file copy utility works is important, says StorageIO's Schulz. "What does it copy? How does it copy? Does it simply copy the file, or copy the file as well as all other attributes, metadata and associated information? Those could be the real gotchas if you haven't brought along all of the extra permissions and access information. Dig into the documentation, talk to the vendor or service provider, and understand what type of data is being moved, and how it is to be moved."
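Schulz's questions can be tested directly by copying a sample file and comparing its attributes on both sides. The sketch below uses Python's standard library as a stand-in for a real migration utility: `shutil.copyfile` copies only the contents, while `shutil.copy2` also attempts to carry over permission bits and timestamps. The `metadata_differences` and `demo` helpers are hypothetical, the example assumes a POSIX system, and it does not examine ACLs or cross-platform permission mapping, which is where the Windows-to-Unix problems described above tend to arise.

```python
import os
import shutil
import stat
import tempfile


def metadata_differences(src, dst):
    """Return the names of basic attributes that differ between two files."""
    a, b = os.stat(src), os.stat(dst)
    checks = {
        "mode": (stat.S_IMODE(a.st_mode), stat.S_IMODE(b.st_mode)),
        "uid": (a.st_uid, b.st_uid),
        "gid": (a.st_gid, b.st_gid),
        "mtime": (int(a.st_mtime), int(b.st_mtime)),
    }
    return [name for name, (x, y) in checks.items() if x != y]


def demo():
    """Copy one file two ways and report what each copy left behind."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "source.dat")
        with open(src, "wb") as f:
            f.write(b"payload")
        os.chmod(src, 0o700)                          # owner-only permissions
        os.utime(src, (1_000_000_000, 1_000_000_000))  # an old timestamp

        contents_only = os.path.join(workdir, "contents_only.dat")
        with_metadata = os.path.join(workdir, "with_metadata.dat")
        shutil.copyfile(src, contents_only)  # data only
        shutil.copy2(src, with_metadata)     # data plus basic metadata

        return (metadata_differences(src, contents_only),
                metadata_differences(src, with_metadata))


lost_by_copyfile, lost_by_copy2 = demo()
```

On a typical POSIX system the contents-only copy loses at least the permission bits and the modification time, while the `copy2` copy matches on all four attributes checked. Running the same kind of comparison against a sample of real files, with the actual migration tool, is a cheap way to find out what it leaves behind before the full move.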

5. Virtualise carefully

Host-based storage virtualisation, which is available from a number of vendors, is a fairly reliable way to accomplish such cross-vendor migration. Future Electronics' Falsafi says the host-based virtualisation provided by the FalconStor software made the actual migration painless. "We zoned the XP with a Fibre Channel switch so [it] came up as another set of hard disks to the IPStor. We created a mirrored LUN on the HP StorageWorks XP24000 array and did synchronisation. Once the primary array and the backup LUNs were synchronised ... all we did was flip the switch from the primary to the backup, and the backup became the primary," he says.

But not all virtualisation is created alike. Some virtualisation appliances can add to the work administrators have to do, or cause application outages while administrators update drivers or the volume managers used to manage the storage, says Nadkarni. For example, he says, a virtualisation appliance can cause problems by changing the SCSI Inquiry String used to identify a specific array. If the appliance changes the inquiry string, the volume manager used to manage the storage must be reconfigured to recognise the new string, he says, or applications that depend on that volume may not run properly. Storage admins should ask virtualisation vendors whether their products are "completely transparent," says Nadkarni, or whether their installation will require changes to servers or other components that could cause application outages.

Nadkarni also suggests staying away from virtualisation appliances that require an array or entire storage network to be taken out of service to virtualise (or unvirtualise) storage resources. Some appliances "may require you to take an outage to reconfigure your network or to take an outage on the entire storage array, to insert the appliance," he says. They can also require the administrator "to change things on the host" such as drivers, multipathing software or volume managers.

6. Thin provisioning

Thin provisioning helps preserve storage space by taking up space on a disk only when data is actually written to it, not when the volume is first set aside for use by an application or user. This eliminates waste when the application or user doesn't wind up needing the disk space. However, many data migration tools write "from block zero through to the very last block" of a volume on the target system regardless of which blocks are actually being used, nullifying the benefits of the thin provisioning a user had applied on the source array, says Sean Derrington, director of storage management and high availability at Symantec Corp.

File-system utilities or host-based volume managers "that are intelligent enough to figure out if the block is being accessed or not" before deciding to write to it can help circumvent this problem, says GlassHouse Technologies' Nadkarni. Block-level migration techniques that are good for preserving the security around data aren't good for preserving thin provisioning, he says, "because they write to the entire volume."
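The difference between writing every block and writing only the blocks that matter can be illustrated with a file-level copy that skips empty regions. This is a simplified sketch under stated assumptions: it treats an all-zero block as unused, whereas the utilities Nadkarni describes would consult the file system's or volume manager's own allocation information, and whether the target actually stays thin depends on the target supporting sparse allocation. The function names and the 4 KB block size are illustrative choices.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # illustrative block size


def copy_skipping_zero_blocks(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy a file, seeking past all-zero blocks instead of writing them.

    Returns the number of blocks actually written to the target.
    """
    zero_block = bytes(block_size)
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block == zero_block[:len(block)]:
                # Leave a hole rather than writing zeros.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
                written += 1
        # Set the final length in case the file ends with a skipped block.
        dst.truncate(dst.tell())
    return written


def demo():
    """Copy a four-block file in which only two blocks hold data."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "volume.img")
        dst = os.path.join(workdir, "volume_copy.img")
        with open(src, "wb") as f:
            f.write(b"A" * BLOCK_SIZE)       # used block
            f.write(bytes(2 * BLOCK_SIZE))   # two unused (zero) blocks
            f.write(b"B" * BLOCK_SIZE)       # used block
        written = copy_skipping_zero_blocks(src, dst)
        with open(src, "rb") as a, open(dst, "rb") as b:
            identical = a.read() == b.read()
        return written, identical


blocks_written, contents_match = demo()
```

In this example only two of the four blocks are written, yet the copy reads back identically to the source. A tool that writes "from block zero through to the very last block" would have written all four.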

7. The devil is in the (software) details

Something as simple as different patch levels applied to software in the old and new environments can cause server crashes after a migration. Nadkarni says migrating among storage arrays also requires uninstalling the previous vendor's software from servers and installing the new vendor's. Not only does this require time, but it could cause instability if components left behind by the incomplete uninstall of older software conflict with other applications.

8. Build in enough learning time

If there's a common theme to these tips, it's that storage migration is complex and full of "gotchas" that can compromise application uptime, reliability or security. "The key to a successful data migration is not having any unknowns in your environment," says Nadkarni. "The more unknowns," he adds, "the bigger the risk." Storage administrators often underestimate the time required to learn their new storage environment and what it takes to migrate data to it successfully.

Besides the technical challenges involved in each data migration, it's also important to clearly understand the business objectives for the migration, says Terri McClure, an analyst at Enterprise Strategy Group in Milford, MA. For example, what's the ROI of the data migration? Is the aim to migrate seldom-used data to less expensive media to reduce disk and power costs, to decrease the data's RTO or both? If so, it may be possible to create automated storage policies to avoid an endless round of manual migrations, she says.

"To do anything successfully and seamlessly you have to do a lot of preparation, thorough preparation," says Future Electronics' Falsafi. "That means analysis, data gathering, trend analysis. For me, it's very vital you get this information and know exactly how your systems behave before you do anything. The cost of an unsuccessful data migration--interrupted business operations, and a loss of revenue and credibility--far outweighs the additional amount of time it may take to thoroughly understand your source or target environments."
