How To Tweak Data Backups In OneFS-based HPC Systems

High-performance computing systems have special requirements when it comes to optimizing backups. Here are some tips to speed up backups on OneFS-based HPC systems.

Many organizations in sectors such as oil exploration, space exploration and medicine deploy high-performance computing (HPC) systems to analyze data far faster than is possible with traditional computing systems. Jobs that routinely take thousands of CPU hours on traditional computing systems can be completed within a matter of hours on HPC systems. HPC systems need simultaneous access to the storage system, and scale-out storage systems fit that requirement best. One example is EMC Isilon storage, which runs OneFS, a single file system that spans the cluster of storage nodes and lets HPC systems read data from the cluster nodes simultaneously.

The volume of data generated and analyzed ranges from terabytes to petabytes. As data volumes increase, so does the challenge of backing up the data within the backup window.

Most backup software solutions support system state backups. A system state backup contains the registry files and other system information required to rebuild the system after a crash. Windows-based HPC systems should be configured for system state backups. Unix-based HPC systems should be configured for root backups so that the system can be rebuilt during recovery from a crash.

As the data being analyzed resides on OneFS, the backup of OneFS needs to be configured. If the backup is configured at mount points on the HPC systems, the data travels from Isilon to the HPC server and then to the backup device. This method occupies LAN bandwidth for a longer duration, as the data has to travel a longer path. To avoid this, configure Network Data Management Protocol (NDMP) backups, so that the data travels directly to the backup device. NDMP backups are supported by most industry-leading backup software.

The direct NDMP method is recommended for the backup of OneFS. With direct NDMP, the backup device is attached to the OneFS cluster through the backup accelerator device. This device backs up the data from the nodes to the backup device directly, circumventing the LAN. This reduces the time required for backup.

Anuj Sharma

If the direct NDMP method is used, remember to create directories on OneFS in proportion to the number of backup devices. For example, if four backup devices are attached to the backup accelerator, create four directories so that four backup streams can run simultaneously, each stream writing to a dedicated backup device. This will improve the backup throughput.
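The effect of running one stream per device can be estimated with simple arithmetic. The sketch below is illustrative only: the 200 MB/s device rate and the directory sizes are assumed figures, not measurements from any Isilon or tape hardware, and it assumes each stream is limited only by its own backup device.

```python
def stream_hours(size_tb: float, device_mb_per_s: float) -> float:
    """Hours needed to write size_tb terabytes at device_mb_per_s MB/s."""
    size_mb = size_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return size_mb / device_mb_per_s / 3600


def backup_window_hours(dir_sizes_tb, device_mb_per_s):
    """One stream per directory, one dedicated device per stream.

    The streams run in parallel, so the job ends when the largest
    directory has finished.
    """
    return max(stream_hours(size, device_mb_per_s) for size in dir_sizes_tb)


# Assumed figures: 40 TB in total, each device sustaining 200 MB/s.
single_stream = backup_window_hours([40.0], 200.0)
four_even = backup_window_hours([10.0, 10.0, 10.0, 10.0], 200.0)
four_uneven = backup_window_hours([25.0, 5.0, 5.0, 5.0], 200.0)

print(f"one directory, one device:     {single_stream:.1f} h")
print(f"four even directories:         {four_even:.1f} h")
print(f"four uneven directories:       {four_uneven:.1f} h")
```

Under these assumptions, four evenly sized directories cut the window to a quarter, while uneven directories leave the job waiting on the largest one. It is therefore worth keeping the directories roughly equal in size.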

If the direct NDMP method is not feasible, an indirect NDMP backup topology should be used. In this method, the data travels over the LAN from the cluster nodes to the server to which the backup device is attached.

If possible, a dedicated backup LAN or a separate VLAN should be configured for indirect NDMP backups in order to isolate production traffic from backup traffic and thus ensure optimum backup throughput. The LAN should provide at least 1 Gbps Ethernet connectivity between the Isilon cluster and the backup device; 10 GbE connectivity is preferred for optimum NAS backup throughput.
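A rough calculation shows why the link speed matters. The sketch below computes the theoretical minimum transfer time for a given data volume; the 10 TB volume and the `efficiency` parameter are assumptions added for illustration, and real throughput will be lower than line rate because of protocol overhead and the speed of the backup device.

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Minimum hours to move data_tb terabytes over a link_gbps link.

    efficiency is the assumed usable fraction of line rate (0 < efficiency <= 1).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    bits = data_tb * 8e12                      # 1 TB = 10^12 bytes = 8 * 10^12 bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


# Assumed volume: 10 TB per backup run.
print(f"1 Gbps:  {transfer_hours(10, 1):.1f} h")
print(f"10 GbE:  {transfer_hours(10, 10):.1f} h")
```

At line rate, 10 TB needs roughly 22 hours on a 1 Gbps link and just over 2 hours on 10 GbE, which is the difference between missing and meeting a typical overnight backup window.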

Even if a backup accelerator is not being used, directories should be created on OneFS. The number of directories should be proportional to the number of backup devices, so that an individual backup policy can be created for each directory and backed up to a dedicated backup device. This will improve the backup throughput, consequently reducing the time required for backups.
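The directory-to-device layout can be planned before any policies are created in the backup software. The sketch below only builds a plan in memory: the directory paths under `/ifs`, the device names and the policy naming scheme are all hypothetical, and it does not call any OneFS or backup-software API.

```python
def build_policies(directories, devices):
    """Assign each directory its own policy, spreading policies evenly
    across the backup devices in round-robin order."""
    if not devices:
        raise ValueError("at least one backup device is required")
    return [
        {
            "policy": f"policy-{index + 1}",          # hypothetical naming scheme
            "path": directory,
            "device": devices[index % len(devices)],  # round-robin assignment
        }
        for index, directory in enumerate(directories)
    ]


# Hypothetical layout: four directories, two backup devices.
directories = ["/ifs/hpc/set1", "/ifs/hpc/set2", "/ifs/hpc/set3", "/ifs/hpc/set4"]
devices = ["backup-device-a", "backup-device-b"]

for policy in build_policies(directories, devices):
    print(policy["policy"], policy["path"], "->", policy["device"])
```

With as many directories as devices, each policy gets a device to itself; with a multiple, each device receives the same number of policies.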

Instead of backing up the same data repeatedly during full backups, consider implementing an archiving solution that archives infrequently used data to the backup device. When the backup software then initiates a full backup, the unchanged data that has already been archived will not be backed up again. For example, a policy in the archiving software can stipulate that data that has not been modified for one year should be archived to the backup device. If, say, a whole petabyte of data meets this requirement, each full backup shrinks by that petabyte and the backup window shrinks proportionately.
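The age rule described above can be illustrated with a short script. This is a minimal sketch, not how any particular archiving product works: it walks a directory tree and splits files into archive candidates and files that stay in the regular backup, based on last-modified time. The function name and the 365-day definition of a year are assumptions made for the example.

```python
import os
import time

ONE_YEAR_SECONDS = 365 * 24 * 3600  # assumed definition of "one year"


def split_by_age(root, max_age_seconds=ONE_YEAR_SECONDS, now=None):
    """Return (archive_candidates, keep_in_backup) for files under root.

    A file is an archive candidate when its modification time is older
    than max_age_seconds relative to now.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    archive, keep = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime < cutoff:
                archive.append(path)
            else:
                keep.append(path)
    return archive, keep
```

Calling `split_by_age` on a directory returns two lists of paths. Summing the sizes of the first list gives an estimate of how much data would drop out of each full backup.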


About the author: Anuj Sharma is an EMC Certified and NetApp accredited professional. Sharma has experience in handling implementation projects related to SAN, NAS and BURA. He also has to his credit several research papers published globally on SAN and BURA technologies.
