The Sanger Institute on using datacentre upgrades to help decode the human genome

Paul Woobey, IT director at the Wellcome Trust Sanger Institute, tells Computer Weekly why the organisation's HPC workload requirements cannot be fulfilled by a move to the cloud

The cloud is the inevitable destination for many enterprise applications and workloads still hosted in corporate datacentres, but it would be wrong to assume it is a good fit for every organisation. And that is certainly true of the Wellcome Trust Sanger Institute (WTSI).

Despite the three datacentres at its Cambridge campus approaching 90% capacity, IT director Paul Woobey is in no hurry to migrate any of the organisation's life sciences workloads to the cloud.

Much of this reluctance relates to the sheer volume of data generated by WTSI's ongoing efforts to decode and sequence the human genome and, in turn, advance the life science community's understanding of how our DNA affects our health and wellbeing.

The three datacentres hold about 35PB of genomic data, as well as 26,000 processing cores, and any attempt to move all of that to the cloud could take years, he says.

“With our bandwidth on the network, which is 10GB to the cloud, it would take years to transfer the data and would be very costly,” he tells Computer Weekly.
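A back-of-the-envelope calculation bears out Woobey's point. The sketch below assumes the "10GB" connection he mentions is a 10Gbit/s link, and that the full line rate could be sustained end to end for the entire 35PB archive, which is optimistic:

```python
# Rough transfer time for the institute's 35PB archive over the cloud link,
# assuming "10GB" means a 10 Gbit/s connection running at full line rate.

ARCHIVE_BYTES = 35e15        # ~35PB of genomic data
LINK_BITS_PER_SEC = 10e9     # assumed 10 Gbit/s link

seconds = ARCHIVE_BYTES * 8 / LINK_BITS_PER_SEC
days = seconds / 86_400

print(f"{days:.0f} days")    # roughly 324 days at full line rate
```

Even under these ideal assumptions the transfer takes the better part of a year; with realistic sustained throughput well below line rate, and the archive still growing, "years" is a fair estimate.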

The institute is only able to do the work it does thanks to the grants and funding it receives from the Wellcome Trust and others, and Woobey fears cloud could affect the organisation’s ability to keep its spending under tight control.

“The uncontrolled costs of these cloud facilities is a concern because you can rack up a £100,000 bill before you know it and that makes me nervous, because I can imagine our scientists getting carried away [with using it] and looking at the bill afterwards and going, ‘woah!’,” he says.


Having its datacentres sited so close to the DNA sequencing machines brings latency, efficiency and performance benefits that moving to the cloud would all but eradicate, says Woobey.

“It’s much better to have the datacentre next to sequencing machines than have them write that data into the cloud, because it would go across the network and there would be added costs, a time delay and we wouldn’t be able to do the experiments we do on the data as quickly as we do now,” he adds.

There are also regulatory reasons for needing to retain the institute’s datasets on-premise, particularly as the data protection landscape is shaping up to become even more complex with the arrival of the EU’s General Data Protection Regulation (GDPR) in 2018.

“The regulatory environment we are moving into, and are in already, prohibits us from moving this data across country boundaries,” says Woobey.

“Because you can’t really tell where the stuff in the cloud is going to be, it makes me nervous, but that is really a secondary concern compared to the cost.”

If the cloud fits

But that is not to say the cloud is a complete no-go area for the institute, he says, as off-premise technologies are used to support collaboration between its researchers and stakeholders all over the world.

The distinction lies in the fact that its teams are not sharing huge amounts of raw genomic data via the cloud with other researchers around the globe – they are simply sharing the results of their own experiments. And in Woobey's eyes, that is a permissible use case.

“I’m very nervous about moving data around,” he says. “I’m much happier about allowing people to come in and run experiments against our data [in the cloud]. They won’t know what the data is or who it belongs to. They will just see the statistical output.

“That’s OK, because you’re not moving the data out to them because we can’t afford it, for one thing, and legislatively it’s not on.”

Against this backdrop, it should come as little surprise that the WTSI is in the process of freeing up extra compute capacity on campus by constructing a £7.6m, 250m2 datacentre, rather than by using the cloud.

The server farm will occupy the last remaining square of an area known as The Quadrant; the institute's three existing datacentres fill the other three units.

“There are 250m2 in each one, but one of the squares is empty and the other three are full,” he says. “So we have 750m2 used, and another 250m2 to go – and that’s what we’re building out now because the capacity of our existing facilities won’t last us another five years.”

Three is not the magic number

Woobey, who joined the institute three years ago as IT director, says that when the three existing datacentres were built 12 years ago, they were expected to be sufficient to see the organisation through the lifetime of the project.

But a lot has changed since then, including a marked ramp-up in performance of the gene sequencing technology it uses and the amount of data that is generated as a result.


“The sequencing machines produce a string of different molecules associated with the DNA that equates to about 200GB of data, which is then written to our datacentre in one way or another, but those machines have become faster and faster over the past few years,” says Woobey.

“So, whereas it took a couple of weeks to get a sequence of molecules out that tells you everything you need to know about that DNA, it now takes a week, and it is ramping up all the time.”

The grants that the institute receives to fund its work are gifted every five years, and the fact that the sequencing machines can do more in a shorter time plays well with that.

“Our grant stays the same [over time] and we can do more and more with the same money, because the sequencers are the same price and we can do twice as much with them,” he says.

“There is a pretty linear relationship between the amount of storage we need and the speed of the sequences and the size of the experiments we can do.”

Over the next five years, and with the help of the new datacentre, the organisation should see its compute power and capacity double.

“We’ll have 70-75PB of data and about 50,000 cores, all geared towards carrying out DNA mapping,” says Woobey.
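Those projections are consistent with a simple doubling of today's footprint, assuming sequencing throughput doubles (a run that took two weeks now takes one) and storage demand scales linearly with throughput, as Woobey describes:

```python
# Sanity check of the five-year projection, assuming throughput doubles
# and storage and compute demand scale linearly with it.

current_storage_pb = 35      # today's archive, per the article
current_cores = 26_000       # today's processing cores

projected_storage_pb = current_storage_pb * 2   # 70PB
projected_cores = current_cores * 2             # 52,000 cores

print(projected_storage_pb, projected_cores)
```

The result, 70PB and roughly 52,000 cores, sits squarely within the 70-75PB and ~50,000-core figures Woobey quotes.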

Higher-density datacentres

The fourth datacentre will be a slightly different beast to those the institute currently relies on, and will be kitted out with higher-density server racks capable of accommodating IT loads of up to 40kW.

“Normally we have 20kW per rack, but because of the density of the computing we have to do and the power of the chips we need to accommodate, we will be using a 40kW rack design to create a high-performance computing environment,” says Woobey.

“We use roughly 1MW to run the other three quadrants and 1MW will be needed to run the fourth on its own.”
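The power figures also give a rough sense of scale. The sketch below assumes the quoted 1MW is pure IT load, with no allowance for cooling or other overhead (a simplification):

```python
# Rack counts implied by the quoted figures, assuming 1MW of pure IT load.

it_load_kw = 1_000   # 1MW budget for the fourth quadrant
rack_kw_new = 40     # high-density racks in the new datacentre
rack_kw_old = 20     # typical racks in the existing quadrants

print(it_load_kw // rack_kw_new)   # 25 racks at the new density
print(it_load_kw // rack_kw_old)   # 50 racks at the older density
```

In other words, doubling the per-rack power budget lets the new quadrant deliver the same 1MW of compute in half as many racks, which is what makes the denser HPC environment fit in a single 250m2 unit.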

The new datacentre, which is set to come online in January 2018, will be kitted out with diesel rotary uninterruptible power supply units (DRUPS) as part of a push by the WTSI to improve the resilience of its infrastructure.


When the other three quadrants were designed and built, datacentre resilience and redundancy were less of a concern, but with 1,500 people on campus directly relying on the output of its server farms, they are now essential considerations for the new facility.

“The data is becoming much more critical than it was before, with so many scientific experiments relying on it being there, so we need this quadrant to be up and running all the time and having no downtime, so it’s been designed to be fully redundant,” he says.

With a move to the cloud firmly off the table, Woobey says the prospect of building another datacentre has not been ruled out should the institute find itself short of capacity again in future.

Realistically, given how much extra capacity the fourth quadrant will have at its disposal, the organisation could take one of its three existing sites offline and revamp it, bringing its design into line with that of the new facility, says Woobey.

“That was part of the thinking when it was originally designed – that we would have one quadrant left fallow or spare, so we could shut one of the others down while we renew it and get it up to capacity before opening it up again,” he says. “At the moment, though, I’m just focusing on bringing this fourth quadrant on board.”
