Many organizations in sectors such as oil exploration, space exploration and medicine deploy high-performance computing (HPC) systems to analyze data far faster than is possible with traditional computing systems. Jobs that routinely take thousands of CPU hours to complete on traditional computing systems can be done within a matter of hours on HPC systems. These HPC systems need simultaneous access to the storage system, and scale-out storage systems are the best fit for this purpose. One example is EMC Isilon storage systems, which run OneFS, a single file system that spans the cluster of storage nodes and allows HPC systems to read data from the cluster nodes simultaneously.
The volume of data generated and analyzed ranges from terabytes to petabytes. As data volumes increase, so does the challenge of completing the backup within the backup window.
Most backup software solutions support system state backups. A system state backup contains the registry files and other system information required to rebuild the system after a crash. For Windows-based HPC systems, configure system state backups. For Unix-based HPC systems, configure a root backup so that the system can be rebuilt during recovery from a crash.
As the data being analyzed resides on OneFS, the backup of OneFS also needs to be configured. If the backup is configured at the mount points on the HPC systems, the data will travel from Isilon to the HPC server and then to the backup device. This method occupies LAN bandwidth for longer, as the data has to travel a longer path. To avoid this, configure Network Data Management Protocol (NDMP) backups so that the data travels directly to the backup device. NDMP backups are supported by most industry-leading backup software.
The direct NDMP method is recommended for the backup of OneFS. With direct NDMP, the backup device is attached to the OneFS cluster through the backup accelerator device. This device backs up the data from the nodes to the backup device directly, circumventing the LAN. This reduces the time required for backup.
If the direct NDMP method is used, remember to create directories on OneFS in proportion to the number of backup devices. For example, if four backup devices are attached to the backup accelerator, create four directories so that four backup streams can run simultaneously, each writing to a dedicated backup device. This will improve the backup throughput.
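The benefit of one directory per backup device can be illustrated with a minimal sketch. The directory paths, device names, sizes and the 200 MB/s per-device rate below are hypothetical values chosen for illustration, not figures from Isilon or any backup product, and the functions are not part of any backup software API.

```python
def plan_streams(directories, devices):
    """Pair each directory with one dedicated backup device (one stream each)."""
    if len(directories) != len(devices):
        raise ValueError("expected exactly one directory per backup device")
    return dict(zip(directories, devices))


def parallel_window_hours(dir_sizes_tb, device_mb_per_s):
    """Hours until the largest directory finishes when all streams run at once."""
    seconds_per_tb = 1e12 / (device_mb_per_s * 1e6)  # decimal TB and MB
    return max(dir_sizes_tb.values()) * seconds_per_tb / 3600


def serial_window_hours(dir_sizes_tb, device_mb_per_s):
    """Hours needed when everything goes through a single stream."""
    seconds_per_tb = 1e12 / (device_mb_per_s * 1e6)
    return sum(dir_sizes_tb.values()) * seconds_per_tb / 3600


# Hypothetical layout: four directories of 10 TB each, four backup devices.
sizes_tb = {
    "/ifs/data/dir1": 10,
    "/ifs/data/dir2": 10,
    "/ifs/data/dir3": 10,
    "/ifs/data/dir4": 10,
}
devices = ["device1", "device2", "device3", "device4"]

plan = plan_streams(list(sizes_tb), devices)
parallel_hours = parallel_window_hours(sizes_tb, device_mb_per_s=200)
serial_hours = serial_window_hours(sizes_tb, device_mb_per_s=200)
```

Under these assumptions the parallel window is set by the largest directory alone, so keeping the directories roughly equal in size gives the most benefit.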
If the direct NDMP method is not feasible, use the indirect NDMP backup topology. In this method, the data travels over the LAN from the cluster nodes to the server to which the backup device is attached, without passing through the HPC systems.
If possible, configure a dedicated backup LAN or a separate VLAN for indirect NDMP backups in order to isolate production traffic from backup traffic and protect backup throughput. The network should provide at least 1 Gbps Ethernet connectivity between the Isilon cluster and the backup device; 10 GbE is preferred for the best NAS backup throughput.
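The difference between 1 Gbps and 10 GbE links can be put in rough numbers with a short sketch. It computes only the theoretical lower bound on transfer time from the link rate; the function name, the 10 TB data size and the `efficiency` parameter are assumptions made for illustration, and real backup throughput will be lower because of protocol overhead, disk speed and backup device speed.

```python
def transfer_hours(data_tb, link_gbps, efficiency=1.0):
    """Lower bound on hours needed to move data_tb terabytes over a link.

    efficiency is the assumed usable fraction of the line rate (0 < e <= 1).
    """
    if data_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid data size, link rate or efficiency")
    bits = data_tb * 1e12 * 8                  # decimal terabytes to bits
    usable_bits_per_s = link_gbps * 1e9 * efficiency
    return bits / usable_bits_per_s / 3600


# Hypothetical 10 TB backup at full line rate on each link type.
hours_1g = transfer_hours(10, link_gbps=1)     # roughly 22.2 hours
hours_10g = transfer_hours(10, link_gbps=10)   # roughly 2.2 hours
```

Even at full line rate, 10 TB needs most of a day on a 1 Gbps link, which is why the faster link matters when the backup window is short.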
Even if a backup accelerator is not being used, directories should be created on OneFS. The number of directories created should be proportional to the number of backup devices so that an individual backup policy can be created for each directory and backed up to a dedicated backup device. This will improve the backup throughput, consequently reducing the time required for backups.
Instead of backing up the same data repeatedly during full backups, consider implementing an archiving solution that archives infrequently used data to the backup device. When the backup software then initiates a full backup, the unchanged data that has already been archived will not be backed up again. For example, a policy in the archiving software can stipulate that data not modified for one year should be archived to the backup device. If, say, a whole petabyte of data meets this requirement, each full backup shrinks by that amount and the backup window shortens proportionately.
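The one-year rule in the example can be sketched as a simple scan for files whose modification time is older than a cutoff. This is only an illustration of the selection criterion, assuming the data is reachable as an ordinary mounted directory tree; the function name and parameters are hypothetical, and a real archiving product would apply its own policy engine rather than a script like this.

```python
import os
import time


def find_archive_candidates(root, max_age_days=365, now=None):
    """Return sorted paths of files under root not modified in max_age_days."""
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * 86400        # seconds in a day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                continue                       # file vanished or is unreadable
            if mtime < cutoff:
                candidates.append(path)
    return sorted(candidates)


def total_size_bytes(paths):
    """Sum the sizes of the given files, i.e. the data a full backup would skip."""
    return sum(os.path.getsize(p) for p in paths)
```

Calling `find_archive_candidates` on a mount point and passing the result to `total_size_bytes` gives a rough estimate of how much data such a policy would remove from each full backup.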
About the author: Anuj Sharma is an EMC Certified and NetApp accredited professional. Sharma has experience in handling implementation projects related to SAN, NAS and BURA. He also has to his credit several research papers published globally on SAN and BURA technologies.