Enterprise accessibility: How Cray is using HPC to open up AI use cases from the datacentre

Enterprise interest in HPC is reportedly being fuelled by the rising demand for artificial intelligence-based applications and services, and Cray is one of a number of providers looking to cash in

The past few years have seen almost every major technology firm talk up the potential for artificial intelligence (AI) to transform the enterprise, providing the skills and compute barriers blocking progress can be overcome.

On the skills front, Google, Amazon, and Microsoft, for example, have all set out plans to make the technology more accessible to enterprise tech teams, while encouraging users to use their respective cloud platforms to handle the compute side of the equation.

There are also signs that the demand for AI is fuelling interest in the use of high-performance computing (HPC) technologies, as enterprises seek out the processing power they need to make their AI ambitions a reality from other sources.

Bhushan Desam, Lenovo’s global AI business leader, recently told Computer Weekly about how enterprise use of AI technologies is causing the traditional HPC user base to broaden.

In recognition of this trend, the company set out plans in 2017 to invest $1.2bn over the next four years for AI-related research and development.

“HPC is going to accelerate AI, but not just in the HPC community,” says Desam.

“Healthcare organisations are using HPC to crunch millions of images in image-based diagnoses. In manufacturing, HPC is used to not only run engineering simulations, but also to predict when something will go down, so they can minimise downtime and improve operational efficiency,” he adds.

HPC manufacturer Cray is another industry player looking to tap into the growing demand for supercomputing resources from AI adopters.

The company recently debuted a new addition to its CS-Storm GPU-accelerated server series, and – in doing so – expanded the range of fast-start AI configurations it offers to datacentres and machine learning users.

The new system comprises a CS-Storm 500NX 4-GPU server, and a 1U server with two Intel Xeon CPUs and four NVIDIA Volta GPUs. It also supports NVIDIA NVLink™ SXM2 GPUs, and significantly widens the range of servers available to organisations aiming to develop new AI-based applications and services.

With this addition to Cray’s series of CS-Storm GPU-accelerated servers, the Seattle-based firm is now offering three different GPU-accelerated HPC systems with three different configurations of processors.

Not only does such expanded configurability form part of the reason why the HPC market is expected to grow to anything between $44bn and $48bn by 2022/23 (7% compound annual growth rate), but it also stands to give the growing enterprise AI market a boost, given the dependency of high-level AI applications on supercomputers.

Scalable configuration choices

As for what the arrival of the 500NX 4-GPU system means for the CS-Storm series (and for Cray’s roster of products as a whole), Fred Kohout, the company’s senior vice-president of products, says datacentres should now be able to support a wider range of AI applications.

“Cray provides multiple platforms, expertise, choice and performance, from scale-out clusters to scale-up supercomputers, along with storage and data management solutions that enable predictive modelling and data-driven discovery,” he adds.

According to Kohout, the CS-Storm server family is intended to be used for applications requiring dense CPU:GPU ratios (2:4 or 2:8), such as those involving deep-learning neural network training.

Cray also provides a variety of AI-focused products and services that go beyond HPC, he adds, and these are just as important for companies looking to develop and implement AI technology.

“To make it easier for organisations to move from smaller proof-of-concept projects to ‘at scale’ production deployments, we provide a full software stack designed for the rapid development and execution of AI applications,” says Kohout.

By this, he means the Cray Programming Environment suite of programs and its Bright Cluster Manager, which allows users to manage the various components of their clustered HPC infrastructure.

Such a range of tools can provide much-needed support to companies using the CS-Storm series as they develop their AI plans into full-blown projects.

“The CS-Storm 500NX 4-GPU system provides another scalable configuration choice for customers as they consider and plan their AI system architecture. The Cray CS-Storm server family is designed to optimise compute density in a highly scalable and performant manner,” he says.

“The design enables configurability and a wide set of options to tailor the platform to the customer’s particular needs to enable the best balance between compute, accelerator compute and network bandwidth based on the needs of the workload,” he adds.

Deployments lagging behind designs

Most significantly, Kohout notes that different “AI use cases require unique combinations of machine intelligence tools, model designs and compute infrastructure”.

It’s this variability that’s important in the current climate, since it’s arguable that progress in AI deployment and development has been constrained in the past by datacentres not having access to appropriate systems.

For example, recent research from Gartner shows that interest in deploying artificial intelligence is still markedly higher than actual deployments, with only 4% of surveyed CIOs having implemented AI, while 46% plan to introduce it at some point in the future.

An earlier release from Gartner also found that CIOs regarded AI as the most “problematic” technology to implement, ahead of cyber security measures and the internet of things (IoT).

While reasons for failing to introduce AI will no doubt vary from company to company, it’s safe to assume many have been deterred by a lack of systems configured to their particular needs, as well as by a lack of supporting software that would help them operate AI-enabled high-performance computers. 

However, it is not Kohout’s view that the AI industry has been held back by an absence of adequate or sufficiently powerful supercomputing systems.

“We see [the 500NX 4-GPU’s release] more as an opportunity to educate non-traditional HPC customers on the benefits of HPC systems applied to AI applications,” he adds.

Market growth for HPC and AI

In other words, the 500NX 4-GPU represents an occasion for Cray to show that HPC isn’t just for a select elite of companies pursuing the most advanced AI applications, but can also be attuned and adapted to the goals of a wide variety of firms performing a variety of tasks.

“Implementing machine and deep learning in many organisations is a journey – from investigation to proof of concept to production applications – that data science and IT teams undertake,” says Kohout.

“Cray’s philosophy is centred around supporting customers today with their pilot and proof of concept projects and serve as a trusted partner as they look to expand and scale their AI efforts in the future.”

The expansion of the CS-Storm series therefore comes at an ideal moment, addressing the wide gap between the aims and achievements of firms in the area of AI implementation.

This gap has also been noticed by other HPC companies, with Super Micro Computer, Hewlett Packard Enterprise, and Dell EMC releasing comparably adjustable systems in the past 12 months. Meanwhile, Cray itself teamed up with Microsoft in October to bring supercomputing-as-a-service onto the Azure platform.

It will be largely by virtue of such releases that the global HPC industry will reach the kind of $48bn figures predicted by analysts. By extension, the enterprise AI market will be better placed to hit its own $6.1bn growth projections in the years to 2022, as increasingly flexible HPC solutions allow companies to fit AI to their needs, rather than forcing them to fit their needs to AI.
