How database virtualisation helped the European Bioinformatics Institute save datacentre space

The European Bioinformatics Institute recently scooped the award for Best Datacentre Project at the CW European User Awards, and here's why

This article can also be found in the Premium Editorial Download: CW Europe: CW Europe – September 2015 edition

In its role as data custodian, it’s up to the European Bioinformatics Institute (EBI) in Cambridge to ensure the world’s biological information is freely available and accessible for use by the global scientific community.

The non-profit is currently sitting on 50,000TB of biological data, which is distributed across three UK datacentres. Scientists from academia and the commercial world are invited to use these datacentres for research, app development or staff training purposes.

According to EBI's own calculations, it fields around 12 million requests each month from around the world for access to this data.

Users can choose to download the data and analyse it locally, or use EBI’s own infrastructure as a service (IaaS) to do the analysis and, in turn, save themselves the hassle of finding somewhere to store it all on-premises.

“This is a fairly new approach and is feeding into the different ways the life science community wants to consume and deliver research. We’re seeing prototypes of this model coming out in a range of life science areas,” says Steven Newhouse, head of technical services at EBI.

With around 20% of the EBI’s 570 staff devoting their time to carrying out collaborative and investigation-led life science research, the amount of data the organisation has to manage is doubling every year.

At the current rate of generation, the data the organisation manages is expected to grow over the next five years to around 1.5 million TB, putting huge pressure on EBI’s data infrastructure and operations.
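As a rough check on that projection, here is a minimal sketch that assumes the 50,000TB figure as the starting point and a clean doubling every year. Both the function name and the perfectly regular growth curve are illustrative assumptions, not EBI's own model.

```python
def project_storage_tb(start_tb: float, years: int, growth_factor: float = 2.0) -> list[float]:
    """Return the projected data volume in TB for year 0 through `years`.

    Assumes compound growth by `growth_factor` each year (2.0 = doubling).
    """
    return [start_tb * growth_factor ** year for year in range(years + 1)]


# Assumed inputs: 50,000TB today, doubling annually, five-year horizon.
projection = project_storage_tb(50_000, 5)

for year, volume_tb in enumerate(projection):
    print(f"year {year}: {volume_tb:,.0f} TB")
```

Under those assumptions, five doublings take 50,000TB to 1.6 million TB, which is in line with the "around 1.5 million TB" estimate.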

In anticipation of this, the organisation set out to virtualise its server estate, paving the way for a wider simplification of the relational and NoSQL databases that house the metadata used by researchers to locate information in its vast datasets.

Virtualise to economise

To achieve this, EBI chose to deploy Delphix’s data-as-a-service (DaaS) offering.

The technology captures and stores a single copy of the database’s metadata and then draws on it to provide virtual copies, as a service, to interested parties such as developers, without duplicating the underlying data.
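One common way to hand out many copies from a single stored original is copy-on-write, where each copy records only the blocks it has changed. The sketch below illustrates that general idea under that assumption. It is not Delphix's implementation, and the `VirtualCopy` class and its methods are hypothetical names.

```python
class VirtualCopy:
    """A clone that shares the base blocks and stores only its own changes."""

    def __init__(self, base_blocks: dict[int, bytes]):
        self._base = base_blocks               # shared, never modified by the clone
        self._changes: dict[int, bytes] = {}   # blocks this clone has overwritten

    def read(self, block_id: int) -> bytes:
        # Prefer the clone's own version of a block, else fall back to the base.
        return self._changes.get(block_id, self._base[block_id])

    def write(self, block_id: int, data: bytes) -> None:
        # Writes land in the clone's private map; the base stays untouched.
        self._changes[block_id] = data

    def private_bytes(self) -> int:
        # Storage consumed by this clone beyond the shared base.
        return sum(len(block) for block in self._changes.values())


# One stored original: 1,000 blocks of 1KB each (illustrative sizes).
base = {block_id: bytes(1024) for block_id in range(1000)}

dev_a = VirtualCopy(base)
dev_b = VirtualCopy(base)
dev_a.write(7, b"x" * 1024)

print(dev_a.private_bytes(), dev_b.private_bytes())
```

In this toy model, two developers each see a full database, yet the only extra storage used is the single block that one of them changed.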

EBI predicts the deployment will allow it to cut its storage footprint by around 70%.

“As the amount of data we produce grows, all of the infrastructure we need to provision databases – and the number of them we need – will continue to scale up,” says Newhouse.

“What Delphix should enable us to do is lower the storage demand and the human resources needed to support that increase in data,” he adds.

According to Newhouse, the technology should also reduce the volume of data flowing across the organisation’s internal networks, while making the act of creating database instances quicker and easier to do.

“A lot of our internal customers are developers who use the metadata to build applications that can be accessed by the broader, external European life sciences community,” says Newhouse.

“Our developers need database instances to do their development work on and what Delphix allows us to do is rapidly clone databases,” he says.

This latter capability cuts the time it takes the IT team to deliver database instances and updates to developers, allowing them to be more productive.

“Delphix allows us to see what the difference is between database releases, so we can synchronise that difference rather than the entire database, which should improve the time it takes to deploy new database releases,” says Newhouse.

“This will enable us to be more responsive and put more releases out each year than we currently feel able to commit to,” he adds.
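The idea of synchronising only the difference between two releases can be sketched as follows. This is a simplified illustration that treats each release as a mapping of record keys to values. The function names and sample records are hypothetical, and the article does not describe how Delphix computes its differences.

```python
_MISSING = object()  # sentinel so a stored value of None still compares correctly


def diff_release(old: dict, new: dict) -> dict:
    """Work out what must change to turn `old` into `new`."""
    upserts = {key: value for key, value in new.items()
               if old.get(key, _MISSING) != value}
    deletes = [key for key in old if key not in new]
    return {"upserts": upserts, "deletes": deletes}


def apply_delta(target: dict, delta: dict) -> None:
    """Bring `target` up to date by applying only the recorded changes."""
    for key in delta["deletes"]:
        target.pop(key, None)
    target.update(delta["upserts"])


# Hypothetical sample records for two consecutive releases.
release_1 = {"rec1": "kinase", "rec2": "transporter", "rec3": "ligase"}
release_2 = {"rec1": "kinase", "rec2": "membrane transporter", "rec4": "hydrolase"}

delta = diff_release(release_1, release_2)

replica = dict(release_1)    # a deployed copy still on the old release
apply_delta(replica, delta)  # ship only the delta, not the whole release

print(delta)
```

Here the unchanged record never travels across the network, so the cost of a release tracks how much changed rather than how large the database is.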

Pilot projects

The organisation moved to deploy Delphix nearly 12 months ago, after embarking on a series of trials of the technology three to four years ago.

“We ran a couple of pilots and gained some experience and confidence in using that over the past couple of years. Then [we] essentially came to a decision point as to whether this was something we were going to kill off or adopt on a larger scale,” says Newhouse.

When it came to the crunch, the feedback garnered from EBI users was good enough for a large-scale roll-out of the technology to begin in the organisation’s datacentres.

The time it took to come to this decision is no reflection on Delphix, says Newhouse, and is more to do with convincing the EBI teams that it was the right way to go.

“We have a very conservative community internally. For us to adopt something and for our internal community to pick up on it we have to commit to that technology for many years,” he says.

“Not only is there sceptical resistance, but we have to be very confident the technology is going to work and – once adopted – is something we can support going forward,” he adds. 

The decision to virtualise its server and database infrastructure has allowed EBI to automate more of its infrastructure processes. Newhouse is keen to capitalise on this to enable the organisation to adopt a more agile approach to application delivery in the future.

“We really want to see how our whole service can become more portable and more easily deployable across different virtualised infrastructures, be it private cloud or public cloud,” he says.
