University tiers HPC cloud with HDS, Spectra Logic and QStar

Oslo University builds academic cloud for research data scalable to exabytes with HDS disk, Spectra Logic tape and QStar archive

The University of Oslo has spent £2.3m on NorStore, a storage network for Norwegian researchers that can scale to exabytes. It is based on tiers of Hitachi Data Systems (HDS) disk and Spectra Logic tape, with a QStar NAS file system front-end.

The first round of the NorStore project started in 2009, when Oslo and Trondheim Universities shared storage resources. Researchers, including life and climate scientists, astrophysicists and chemists, shared unstructured data from model runs on high performance computing (HPC) systems via a 10Gbps link over the 500km distance.

The first round of the project was based on a pair of Sun StorageTek 6540 SANs in each location, with file access via IBM’s GPFS parallel file system. But this first iteration of NorStore’s storage was “like a big bucket”, said Hans Eide, department head for research computing at the University of Oslo.

“Also data was mirrored between the two sites. This meant all data was duplicated and it was not a true backup; if something was deleted at one site it was also deleted that night at the other,” said Eide.
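The distinction Eide draws between a mirror and a backup can be illustrated with a minimal Python sketch. It is not NorStore's replication code; the function names and the file name `run1.nc` are hypothetical, and sites are modelled as plain dictionaries. A mirror copies only the current state, so a deletion propagates, while a backup keeps earlier snapshots.

```python
def nightly_mirror(primary):
    """Mirror: the remote site becomes a copy of the current state only."""
    return dict(primary)


def nightly_backup(primary, snapshots):
    """Backup: append a point-in-time copy and keep the older ones."""
    snapshots.append(dict(primary))
    return snapshots


# Night 1: a file exists at the primary site (hypothetical name and content).
primary = {"run1.nc": b"model output"}
mirror = nightly_mirror(primary)
snapshots = nightly_backup(primary, [])

# Day 2: the file is deleted by mistake, then the nightly jobs run again.
del primary["run1.nc"]
mirror = nightly_mirror(primary)
snapshots = nightly_backup(primary, snapshots)

lost_from_mirror = "run1.nc" not in mirror       # the deletion was replicated
recoverable = "run1.nc" in snapshots[0]          # the first snapshot still has it
```

In this toy model the mirrored copy disappears on the second night, whereas the first snapshot still holds the file.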

Capacity was also an issue and, two years ago, Oslo University began to offload mirrored data to tape when the arrays became full. “By the time money became available to upgrade the SANs and increase capacity, it was better for us to go for a new solution,” said Eide.

After extensive evaluation of the systems available, the university opted for disk capacity from HDS with tape capacity from Spectra Logic and a QStar archiving/file system overlaying access to tape.

In the new NorStore setup, a combination of four HDS Hitachi Unified Storage VM controllers and a two-node High Performance NAS (HNAS 3090) provides primary storage of around 4PB capacity; 75% is for research data while the remainder serves university administration.

Data is tiered between SAS and SATA drives, according to a “heat map” of usage on a 24-hour cycle.
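The article does not describe how the HDS arrays compute their heat map, but the general idea of usage-based tiering can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name `plan_tiers`, the file names and the fixed number of fast-tier slots are hypothetical, and the access log stands in for one 24-hour window.

```python
from collections import Counter


def plan_tiers(access_log, fast_slots):
    """Split files into a hot set (fast tier, e.g. SAS) and a cold set
    (capacity tier, e.g. SATA) from one cycle's access log."""
    heat = Counter(access_log)                        # accesses per file
    ranked = [name for name, _ in heat.most_common()]  # most accessed first
    hot = set(ranked[:fast_slots])
    cold = set(ranked[fast_slots:])
    return hot, cold


# Hypothetical accesses recorded over one 24-hour cycle.
log = ["climate.nc", "genome.fa", "climate.nc", "sky.fits", "climate.nc", "genome.fa"]
hot, cold = plan_tiers(log, fast_slots=1)
```

With one fast-tier slot, the most frequently read file lands in the hot set and the rest are marked for the slower tier; a real array would work on blocks or pages rather than file names.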

Meanwhile, around 3.6PB of nearline data resides on a Spectra Logic T-Finity tape library with 918 slots and four TS1140 drives. This is front-ended by a server running QStar’s proprietary TDO (Tape/Disk Object) file system.

The QStar server provides about 14TB of disk cache with access to the multi-petabyte tape back end. “To the user it just looks like a large file system,” said Eide.

In this respect it resembles the open source Linear Tape File System (LTFS), which can enable “tape NAS”, in which a server front end provides a file system for access to data held in tape libraries.

The key benefits to the university have been cost savings and ease of use, said Eide.

“We use iRODS [Integrated Rule-Oriented Data System] to provide easy access to data for researchers via a web browser. Also, tape is cheaper than disk if you don’t have performance requirements. We expect a lot of data not to be hot and also we can potentially grow to exabyte capacities without adding complexity,” said Eide.
