AMD hits pay dirt in the cloud with EPYC server platform

Chipmaker AMD is looking to regain lost ground with its EPYC server platform, and it seems the cloud provider community is getting on board with its plans

The momentum surrounding AMD’s EPYC server processors has been gradually building since their launch earlier this year, with products based on the technology emerging in the enterprise and high-performance computing (HPC) space.

The chipmaker has now received a further boost from Baidu and Microsoft, with the latter becoming the first major public cloud to offer virtual machine (VM) instances powered by EPYC processors on its Azure platform.

Microsoft signalled its intent to offer EPYC-based cloud services at AMD’s launch event in June 2017, and those promises have now come to fruition with the preview availability of new storage-optimised L-series virtual machines that run on EPYC-based hardware.

“We have worked closely with AMD to develop the next generation of storage-optimised VMs called Lv2-Series, powered by AMD’s EPYC processors,” said Microsoft’s director of Azure compute, Corey Sanders, in a blog post.

The Lv2-Series is designed to support customers with demanding workloads such as MongoDB, Cassandra and Cloudera, which are storage-intensive and demand high levels of I/O, he said.

This is a good match for the EPYC’s capabilities, according to Scott Aylor, corporate vice-president and general manager of enterprise solutions at AMD.

“These L-Series instances tend to be those that are focused on very high levels of storage coupled with high-performance compute, which is a great fit as it relates to the attributes where EPYC is strong,” he said. “So when you look at our industry-leading core density and memory bandwidth, coupled with our very strong I/O capability with industry-leading PCI Express, it means they can now address a massive amount of storage for things like large databases and in-memory analytics, looking at things like NoSQL. This is where the L-Series plays very strongly.”

Microsoft is using two-socket servers based on the EPYC 7551, a 2GHz version of the processor with 32 cores. The Lv2-Series instances come in sizes ranging from eight to 64 cores, the latter also featuring 512GB of memory and eight 1.9TB SSDs for storage.
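The figures quoted above can be checked with some simple arithmetic. The following is a minimal sketch only: the class and field names are hypothetical, the drive capacity is treated as decimal gigabytes (1.9TB = 1,900GB), and it assumes every core in the two-socket host counts towards the instance sizes, which Microsoft has not confirmed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical description of the host server (names are illustrative)."""
    sockets: int
    cores_per_socket: int

    @property
    def total_cores(self) -> int:
        return self.sockets * self.cores_per_socket


@dataclass(frozen=True)
class InstanceSpec:
    """Hypothetical description of a VM size, using the figures in the article."""
    cores: int
    memory_gb: int
    ssd_count: int
    ssd_size_gb: int  # assumption: decimal GB, so 1.9TB = 1900GB

    @property
    def raw_ssd_gb(self) -> int:
        return self.ssd_count * self.ssd_size_gb


# Two-socket server built on the 32-core EPYC 7551
host = ServerSpec(sockets=2, cores_per_socket=32)

# Largest Lv2-Series size quoted: 64 cores, 512GB memory, eight 1.9TB SSDs
largest = InstanceSpec(cores=64, memory_gb=512, ssd_count=8, ssd_size_gb=1900)

print(host.total_cores)                    # 64 cores per host
print(largest.raw_ssd_gb / 1000)           # 15.2 TB of raw SSD capacity
print(largest.memory_gb / largest.cores)   # 8.0 GB of memory per core
```

On these assumptions, the largest instance size matches the full core count of a single two-socket host, with about 15.2TB of raw flash and 8GB of memory per core.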

AMD said it could not disclose how many EPYC-based systems Microsoft is deploying to service the Lv2-Series VMs, but Aylor said AMD expects Microsoft to expand its use of EPYC-based systems to underpin more of its services in future.

“We anticipate this being a continued broad and deep relationship where there are many different instance types where EPYC may be of value,” he said.

Microsoft’s endorsement of EPYC for Azure could encourage other cloud providers to adopt the platform. Chinese internet giant Baidu is known to be a customer, in its case using single-socket EPYC servers to drive compute-centric workloads as part of its ABC platform (AI + big data + cloud).

AMD hits comeback trail

The EPYC processor was billed as something of a comeback product as AMD looks to regain ground lost to rival Intel since the heyday of the Opteron chips a decade ago, when the firm boasted a 25% share of the server market.

Even AMD realised this would be a stiff challenge, with Forrest Norrod, senior vice-president and general manager of AMD’s enterprise group, saying it had to “focus on areas where Intel was either coming up short or wasn’t interested in delivering”.

What this has meant in practice is supporting a large number of compute cores (up to 32 per socket) for mainstream two-socket and even single-socket systems, while leaving the four- and eight-socket segments of the market to Intel.

AMD also gave EPYC more memory channels, enabling up to 2TB of memory per socket, and greater I/O capacity with 128 lanes of PCI Express. These lanes can be used to connect multiple devices, such as GPU accelerators or NVMe flash storage drives, directly to the processor, and can also be configured as SATA ports to connect drives that use this host interface.

This is one reason why many early designs using the EPYC have focused on storage servers or on applications such as machine learning and big data analytics, making use of multiple GPU accelerators, such as those demonstrated by AMD at the recent SC17 supercomputing conference.

Mounting Olympus

To drive its Lv2-Series VM instances, Microsoft is using server hardware based on its Project Olympus design.

This is an “open” rack-mount server design, based on specifications the Redmond firm developed under the aegis of the Open Compute Project (OCP), with the intention of contracting a hardware manufacturer to build the servers for its datacentres.

Microsoft declined to name who is making the server hardware, but it is likely to be an ODM supplier such as Wiwynn or ZT Systems, both of which already offer commercially available Project Olympus-based hardware.

Beyond the Azure and Baidu deployments, AMD disclosed a few months ago that it had secured major new datacentre customers in China, including Tencent, using EPYC-based servers from Sugon and Lenovo.

Meanwhile, in the enterprise arena, in November HPE announced plans to start shipping an EPYC-based version of its ProLiant DL380 2U rack-mount system – one of the most widely deployed server models on the market.

The ProLiant DL385 Gen10 is a two-socket system, supporting up to 64 processor cores, backed by up to 4TB memory and with support for up to 24 NVMe flash drives, making it a good platform for operating VMs at a low operating cost, according to HPE.

Overall, AMD’s EPYC platform is beginning to gain some traction among customers, not just with enterprises but also large cloud providers. However, AMD has a long way to go to regain the 25% server market share it held in the heyday of the Opteron, if it ever does so.

Nevertheless, the EPYC chips are providing customers with alternative choices to Intel Xeon servers, for which there has long been a lack of real competition. With some large hyperscale customers also beginning to show interest in ARM-based systems using chips such as Qualcomm’s Centriq 2400, the datacentre server market is beginning to get interesting again.
