Paul Calleja, director of the high performance computing service at the University of Cambridge, maintains that supercomputers should be accessible to as wide a range of businesses and academic researchers as possible.
And that means one thing – building supercomputers from commodity chips, rather than buying in specialist hardware that needs specialists to program.
Calleja oversees one of the world’s most powerful computers, the Darwin supercomputer cluster, ranked this month at number 93 in the Top 500 list.
With 9,600 Intel Sandy Bridge processor cores, the machine is capable of running calculations at 200 Teraflops, equivalent to 200 trillion calculations per second.
It is not the fastest supercomputer in the UK, said Calleja. There are faster machines at Edinburgh, Daresbury and the Met Office. But it is certainly the fastest scalable machine built from commodity Intel chips.
“It’s a real state of the art Intel-based cluster, and it's really been designed to be as good as you can make it, technology-wise,” Calleja told Computer Weekly.
Cambridge’s high-performance computing service provides resources to academic researchers and high-tech businesses in the Cambridge area.
The Darwin HPC cluster
There are three parts to the cluster:
- Sandy Bridge - 9,600 Intel Sandy Bridge cores running on 600 quad-server Dell chassis;
- Westmere - 1,536 Westmere cores running on 128 dual-socket Dell blade servers;
- Tesla GPU subcluster - 128 GPUs running on 32 dual-socket Dell T5500 servers.
Source: Cambridge HPC Service
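As a rough sanity check on the figures quoted above, the short Python sketch below divides each total by the number of chassis or servers it runs on, and spreads the quoted 200 teraflops across the Sandy Bridge cores. It assumes the 200 teraflops figure refers to the Sandy Bridge partition alone, which the article implies but does not state outright; the variable and function names are illustrative only.

```python
# Figures as quoted in the article; names are illustrative, not from the source.
SANDY_BRIDGE_CORES = 9_600
SANDY_BRIDGE_CHASSIS = 600
WESTMERE_CORES = 1_536
WESTMERE_SERVERS = 128
TESLA_GPUS = 128
TESLA_SERVERS = 32
QUOTED_TERAFLOPS = 200  # assumed here to cover the Sandy Bridge cores only


def per_unit(total, units):
    """Average number of items per chassis or server."""
    return total / units


cores_per_chassis = per_unit(SANDY_BRIDGE_CORES, SANDY_BRIDGE_CHASSIS)
cores_per_blade = per_unit(WESTMERE_CORES, WESTMERE_SERVERS)
gpus_per_server = per_unit(TESLA_GPUS, TESLA_SERVERS)

# 1 teraflops = 1,000 gigaflops
gflops_per_core = QUOTED_TERAFLOPS * 1_000 / SANDY_BRIDGE_CORES

print(f"Sandy Bridge cores per chassis: {cores_per_chassis:.0f}")
print(f"Westmere cores per blade:       {cores_per_blade:.0f}")
print(f"GPUs per Tesla server:          {gpus_per_server:.0f}")
print(f"Gigaflops per Sandy Bridge core: {gflops_per_core:.1f}")
```

On these numbers, each Sandy Bridge chassis holds 16 cores, each Westmere blade 12 cores and each Tesla server four GPUs, with roughly 20.8 gigaflops per Sandy Bridge core.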
The service has helped research teams in academia and business make important breakthroughs in areas as diverse as theoretical physics, gene sequencing and jet engine design.
“Increasingly large-scale computation and simulation is being used as the third paradigm for research. You have theory and experiment, and now simulation really is the link between the two,” he said.
Calleja joined the Cambridge high-performance computing (HPC) service from the HPC centre at Imperial College six years ago.
He took the decision to replace Cambridge’s proprietary supercomputer hardware with supercomputers based on commodity Intel chips.
Until that point, the service had relied on proprietary machines from companies including IBM, Hitachi and, more recently, Sun.
“I come from a commodity supercomputing background and I have a clear vision that a commodity cluster could be as powerful as proprietary systems. And in the UK we demonstrated that for the first time,” he said.
The result was a supercomputer that ran significantly faster than the IBM Regatta machine at the UK's national supercomputing centre in Daresbury – then the fastest machine in the UK – and yet was significantly cheaper.
“There was a big difference in price performance,” he said.
Calleja also changed the computing centre’s business model – transforming it from a free service funded centrally by the university to a commercial service, open to both academics and businesses.
The shift to a commercial service has meant investing heavily in improving the support and service levels the department offers to businesses and academics.
“University service departments are not known for good service, so we changed the way our staff work and had a heavy focus on customer service, being nice to our customers and making them feel comfortable,” he said.
- The Cambridge HPC service operates as a pay-per-use service.
- The university uses Moab, a management tool from Adaptive Computing, to manage the workload and billing.
- The Moab software allows the service to manage computing jobs from 800 users in 30 departments, with utilisation levels of 87%.
Cambridge still runs a free service for academics, with a lower quality of service, and caps on usage, alongside its paid service.
Commercial realities hone efficiency
But as service levels have improved, the balance has shifted from 10% paid work, to 80% paid work.
“There is an advantage to the university because we are self-sustaining and, in my opinion, charging encourages efficiency of usage,” said Calleja.
“But it also makes us efficient because we have to compete for services.”
Moving to a commodity Intel infrastructure has opened up supercomputing in Cambridge to a much wider cross section of researchers.
“Many academics already have their own small Intel clusters, or are doing calculations on their own Intel workstation,” Calleja said.
“So it’s very easy for these guys to migrate to a supercomputer cluster.”
Academic and business applications
This accessibility has contributed to a ten-fold increase in the use of supercomputer services at the Cambridge centre over the past ten years.
At the same time, the university has benefited from a three-fold increase in its output of peer-reviewed papers on high-performance computing, with 370 peer-reviewed articles published over the last four years.
Research teams are using the high-performance computer to make breakthroughs in materials modelling, using quantum mechanics.
And the service is helping astronomers investigate the origins of the universe, analysing the background microwave radiation that permeates space.
Commercially, Cambridge has worked with the Lotus Formula 1 team to model the aerodynamics of racing cars, and with a small graphic arts company to render the special effects in the latest Planet of the Apes movie.
More commercial work is expected to follow, as Cambridge prepares to merge its supercomputing centre with Imperial College’s HPC centre.
The combined HPC service, dubbed Core, will share hardware, capital resources and systems staff.
“The idea is we can form a bigger critical mass of resources and, because we have a bigger critical mass, we will be able to better turn those resources to industry,” said Calleja.
At a time when university funding is being cut back, that could be an astute move.