
Bristol and Cambridge to host 2024 AI supercomputers

Dell and HPE are working on supercomputers with the two universities as part of the government’s artificial intelligence funding drive

Bristol and Cambridge are to become home to two of the UK’s newest and most powerful supercomputers – Isambard-AI and Dawn.

Earlier this year, the government announced an investment of £225m to create the UK’s most powerful artificial intelligence (AI) supercomputer with the University of Bristol and Hewlett Packard Enterprise (HPE). The funding is part of a £300m package, announced at the government’s AI Safety Summit at Bletchley Park this week, to create a national Artificial Intelligence Research Resource (AIRR).

This supercomputer, called Isambard-AI, will connect with another new supercomputer cluster in Cambridge, called Dawn, which is being co-designed by Intel, Dell Technologies and the University of Cambridge.

Dawn will be hosted at the Cambridge Open ZettaScale Lab. According to Dell, it will be the most powerful AI supercomputing cloud and will run scientific OpenStack cloud software developed with UK SME StackHPC.

Dell’s system is based on Dell PowerEdge XE9640 servers, 4th Gen Intel Xeon Scalable processors and Intel Data Center GPU Max Series accelerators. Thanks to liquid cooling and its versatile configuration, Dell said the server system was well equipped to handle the demands of AI and high-performance computing (HPC) workloads. According to Dell, direct liquid cooling technology provides more efficient and cost-effective cooling than traditional air-cooled systems.

On the software side, the system uses Scientific OpenStack from UK SME StackHPC, which provides a cloud supercomputing software environment fully optimised for AI and simulation. This is combined with the oneAPI open software ecosystem and optimised frameworks that help developers speed up AI and HPC workloads, and enhance code portability across multiple hardware architectures.

Paul Calleja, director of research computing services at the University of Cambridge, said: “Dawn Phase 1 represents a huge step forward in AI and simulation capability for the UK, deployed and ready to use now. The system plays an important role within a larger context, where this co-design activity aims to deliver a Phase 2 supercomputer in 2024, which will boast 10 times the level of performance. If taken forward, Dawn Phase 2 would significantly boost the UK’s AI capability and continue this successful industry partnership.”

The system from HPE will be configured with 5,448 Nvidia GH200 Grace Hopper Superchips, which combine Nvidia’s Arm-based Grace CPU with a Hopper-based GPU optimised for power efficiency and scale. The hardware uses the HPE Slingshot 11 interconnect and nearly 25PB (petabytes) of storage using the Cray ClusterStor E1000, optimised for AI workflows.

Like the Dell hardware, Isambard-AI will feature direct liquid cooling as part of the HPE Cray EX supercomputer design, to improve energy efficiency and reduce its overall carbon footprint. HPE said the system would be hosted in a self-cooled, self-contained datacentre, using the HPE Performance Optimised Data Center (POD), situated in the National Composites Centre (NCC) at the Bristol and Bath Science Park.

HPE is also collaborating with the University of Bristol on a heat re-use model, extracting waste heat from the Isambard-AI system to use as renewable energy to heat local buildings, supporting the UK government’s 2030/2040 net-zero carbon-efficiency targets.

“Collaborations like the one between the University of Cambridge, Dell Technologies and Intel, alongside strong inward investment, are vital if we want compute to unlock the high-growth AI potential of the UK,” said Tariq Hussain, head of UK public sector at Dell Technologies.

“It is paramount that the government invests in the right technologies and infrastructure to ensure the UK leads in AI and exascale-class simulation capability. It’s also important to embrace the full spectrum of the technology ecosystem, including GPU diversity, to ensure customers can tackle the growing demands of generative AI, industrial simulation modelling and ground-breaking scientific research.”
