
HPC research cluster gets Red Hat OpenStack private cloud

Petabyte-scale eMedLab consortium opts for a private cloud built on Red Hat Enterprise Linux OpenStack with hybrid Cinder and IBM Spectrum Scale storage, rejecting object storage and the public cloud

EMedLab, a partnership of seven research and academic institutions, has built a 5.5PB private cloud high-performance computing (HPC) cluster on Red Hat Enterprise Linux OpenStack, using Cinder block storage and IBM’s Spectrum Scale (formerly GPFS) parallel file system.

The organisation rejected object storage – an emerging choice for very large-capacity research data use cases – and the public cloud, because of concerns over control and security of data.

EMedLab built the HPC cluster – in conjunction with Sheffield-based HPC specialist OCF – to provide compute resources to researchers working on genetic susceptibility to and appropriate treatments for cancers, cardio-vascular and rare diseases. Researchers can request and configure compute resources of up to 6,000 cores and storage for their projects via a web interface.

The HPC cluster is hosted at the JISC datacentre in Slough. It comprises 252 Lenovo Flex System blades with 24 cores and 500GB of RAM each. Each blade runs a KVM hypervisor upon which virtual machines are created for research projects, while private cloud management functions come from Red Hat Enterprise Linux OpenStack.
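In an OpenStack deployment of this kind, the KVM hypervisor is driven by the Nova compute service through libvirt. A minimal sketch of the relevant settings in `nova.conf` might look as follows – the values shown are standard OpenStack options, not eMedLab's actual configuration:

```ini
# nova.conf (illustrative fragment)
[DEFAULT]
# Use the libvirt driver to manage virtual machines on each blade
compute_driver = libvirt.LibvirtDriver

[libvirt]
# Run guests as full KVM virtual machines (hardware-accelerated)
virt_type = kvm
```

With `virt_type = kvm`, Nova schedules research-project virtual machines onto the blades as hardware-accelerated KVM guests.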

Storage – to a total capacity of 5.5PB – is a combination of OpenStack Cinder block storage and IBM Spectrum Scale. Physical storage capacity comes via Lenovo GSS24 and GSS26 SAS JBODs with 1.2PB on a faster scratch tier and 4.3PB of bulk capacity on larger SAS drives.

Connectivity is Ethernet with 10Gbps Mellanox NICs.

EMedLab rejected object storage, which is available in OpenStack as Swift and increasingly popular for the storage of large amounts of unstructured data.

That’s because the eMedLab cluster needs the Posix file system compatibility of GPFS married to the private cloud capabilities – such as multi-tenancy – of OpenStack and its Cinder block access storage method, said Bruno Silva, operations lead for eMedLab.


“It’s a non-standard approach, to provide flexibility,” said Silva. “OpenStack provides storage to virtual machines through Cinder and Cinder utilises files within Spectrum Scale, which gives us a Posix file system to share data between virtual machines.
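The arrangement Silva describes – Cinder volumes backed by files in a Spectrum Scale file system – is supported by OpenStack's IBM GPFS volume driver. A minimal `cinder.conf` sketch of such a backend follows; the backend name and mount path are illustrative assumptions, not eMedLab's actual settings:

```ini
# cinder.conf (illustrative fragment)
[DEFAULT]
enabled_backends = gpfs

[gpfs]
# GPFS/Spectrum Scale driver: each Cinder volume is a file in the file system
volume_driver = cinder.volume.drivers.ibm.gpfs.GPFSDriver
# Directory within the mounted GPFS file system where volume files are created
gpfs_mount_point_base = /gpfs/cinder/volumes
volume_backend_name = GPFS
```

Because each volume is an ordinary file within GPFS, the same Posix file system can also be mounted directly to share data between virtual machines, as Silva describes.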

“In OpenStack we opted for Cinder block storage and not Swift because we don’t yet have workloads designed for object storage. GPFS does, however, have object storage on its roadmap. We’ve also had discussions about Manila [OpenStack’s file access storage system] but we’re not convinced it is mature enough to use yet.

“We could have gone for object storage if we were certain we were going for a pure OpenStack approach. But the system we’ve gone for also has GPFS/Spectrum Scale to give good performance for Posix and Cinder.”

The choice between public and private cloud

During the procurement process, the private cloud/OpenStack approach was weighed against buying in a more off-the-shelf HPC system, as well as use of public cloud.

“All the options were close in terms of quality, but there were various constraints in our requirements and with the OCF system we found a better fit,” said Silva.

“We preferred a private cloud system because it gave us control over our infrastructure and we know it is secure. We’re keeping an eye on public cloud. We know it is coming, but it has to overcome hurdles in terms of cost and regulatory concerns. But for the work we’re doing private cloud is most appropriate at the moment.”
