University College London (UCL) has installed around 3PB of DataDirect Networks (DDN) object storage as part of a project to provide storage and archive services for academic researchers at the institution.
UCL started to consolidate storage on a DDN infrastructure of clustered NAS (GridScaler) and object storage (WOS) about three years ago. In 2013 the university had 450TB of clustered NAS on GridScaler, which is built on the IBM-developed GPFS clustered file system, and 50TB in an experimental WOS object store.
Since then, head of research data services Max Wilkinson has overseen an expansion of the WOS object store by 24 extra nodes, each with 60 3TB drives, to a total of around 3PB, while the clustered NAS component has remained at the same capacity.
URDS is deployed across two UCL datacentres, with three WOS replicas across the two sites.
The GridScaler clustered NAS component of URDS provides high performance for users who need to move a lot of data around quickly for analytics workloads.
Meanwhile, the WOS object store provides lower-cost data storage, at about one-third the cost of GridScaler, for those with less rigorous performance needs, said Daniel Hanlon, data storage architect at UCL.
“Clustered NAS is 3x more expensive per TB but offers performance of 160Gbps,” said Hanlon. “We get nothing like that from object storage. But the management overhead is a lot lower.
“If a disk goes on GridScaler you have to rush to replace it. In object storage as soon as a disk fails, the system starts to replicate it. You can go around and replace failed disks once a month if you want.”
A year ago object storage formed the minority of URDS storage. Now it has expanded to form the bulk of capacity. Why?
Wilkinson said: “Initially we had a lot of early adopters that were familiar with clustered NAS and object storage was a new technology. Since then it’s been a big consideration to reduce the administrative overhead.”
But Hanlon said object storage is more flexible: “We don’t, for example, have to decide up front how we will do replicas. With clustered file systems you have to decide all that in advance and then it’s difficult to modify. With object there are lots of layers that can be tuned.”
Object-based storage does away with the need for traditional hierarchical file systems, which can become unwieldy at large data volumes and with large numbers of files.
Instead, data is organised in a flat structure, with each object having its own unique identifier, in a similar way to DNS on the internet.
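The flat, identifier-based model described above can be illustrated with a minimal Python sketch. This is a toy in-memory example, not the DDN WOS API; the class name `FlatObjectStore` and its methods are hypothetical and chosen only for illustration.

```python
import uuid
from typing import Dict, Optional, Tuple


class FlatObjectStore:
    """Toy in-memory object store (hypothetical, not the WOS API).

    There are no directories or paths: every object lives in one flat
    namespace and is addressed only by a generated unique identifier.
    """

    def __init__(self) -> None:
        # Maps object ID -> (payload bytes, metadata dictionary)
        self._objects: Dict[str, Tuple[bytes, dict]] = {}

    def put(self, data: bytes, metadata: Optional[dict] = None) -> str:
        """Store an object and return the ID the store assigned to it."""
        object_id = uuid.uuid4().hex  # assumption: random 128-bit ID
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Fetch an object's payload by its ID; raises KeyError if unknown."""
        return self._objects[object_id][0]

    def metadata(self, object_id: str) -> dict:
        """Return a copy of the metadata stored alongside the object."""
        return dict(self._objects[object_id][1])


store = FlatObjectStore()
oid = store.put(b"sequencer output", {"project": "example"})
payload = store.get(oid)  # retrieved by ID alone, with no file path involved
```

The caller keeps the returned identifier rather than a file path, which is why such a store has no directory tree to grow unwieldy as the number of objects rises.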
For that reason, object storage is touted as being suited to large datasets, but it is early days and has yet to see widespread acceptance.
UCL’s 3PB of DDN object storage is likely to see further expansion, as currently only about 500 out of 5,000 researchers have data hosted on URDS, with most hosting locally in departmental storage or on individual drives.
Wilkinson’s team has adopted a softly-softly approach and has not ordered researchers to move data to URDS.
Wilkinson said: “In an academic environment if you mandate something it’s almost certainly going to be ignored. And so far it’s an approach that’s proven successful; it’s a hungry market.
“Some big users have large storage requirements and have their own infrastructures but we expect them to come over to us as they reach the end of tech refresh cycles.”
The next stage of UCL’s storage project will be an archive service for researchers that will launch in the first quarter of 2015 and will be based on an LTFS tape infrastructure from a third-party service provider.