AWS G4 aims to lower cost of GPU-powered AI inference

Amazon Web Services seeks to cut the cost of running artificial intelligence inference engines with the introduction of an Nvidia V100 GPU-based instance

Cliff Saran, Managing Editor

Published: 23 Sep 2019 10:33

The Amazon Web Services (AWS) public cloud appears to have undercut its main competitors with the introduction of a new GPU (graphics processing unit) EC2 instance called G4, which uses the Nvidia V100 GPU.

AWS said the new G4 instance is designed to help accelerate machine learning inference and graphics-intensive workloads, both of which are computationally demanding tasks that benefit from additional GPU acceleration. Application areas include adding metadata to an image, object detection, recommender systems, automated speech recognition, and language translation.

AWS said the G4 instance can be used both to train artificial intelligence (AI) algorithms and to run the inference engines for AI-powered applications. It said GPUs enable its customers to reduce machine learning training times from days to hours, but the real cost in AI lies in running the trained algorithm for inference.

“Inference is what actually accounts for the vast majority of machine learning’s cost,” it said. “According to customers, machine learning inference can represent up to 90% of overall operational costs for running machine learning workloads.”

On-demand pricing for the g4dn.xlarge instance, which has four virtual cores, one GPU and 16GB of memory, starts at $0.526 per hour. The eight-virtual-core instance, with 32GB of RAM, costs $0.752 per hour.
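To put the hourly figures in context, the short Python sketch below estimates what each instance would cost if left running for a full month. It is illustrative only: the 730-hour month is an assumption (8,760 hours in a year divided by 12), the function name and instance labels are made up for the example, and the calculation ignores storage, data transfer and any discounts.

```python
# Assumption: an "average" month of 730 hours (8,760 hours / 12).
HOURS_PER_MONTH = 730

# On-demand hourly prices in US dollars, as quoted in the article.
# The dictionary keys are descriptive labels, not official instance names.
hourly_prices = {
    "g4dn.xlarge (4 virtual cores, 16GB)": 0.526,
    "G4 with 8 virtual cores, 32GB": 0.752,
}


def monthly_cost(hourly_price, hours=HOURS_PER_MONTH):
    """Return the cost of running one instance continuously for `hours`."""
    return round(hourly_price * hours, 2)


for label, price in hourly_prices.items():
    print(f"{label}: ${monthly_cost(price):,.2f} per month")
```

On these assumptions, the smaller instance works out at roughly $384 a month and the larger one at roughly $549 a month if run around the clock.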

By comparison, Google’s single-GPU V100 instance, with 16GB of memory, is currently priced at $0.74 per hour under Google’s “pre-emptive GPU price” usage model. Meanwhile, Microsoft’s NC6s v3 instance on Azure, with 6 V100 cores and 112GB of memory, costs £2.2807 per hour on-demand.

Using G4 for machine learning on the AWS public cloud is supported via Amazon SageMaker or AWS Deep Learning AMIs (Amazon machine images). AWS said it supports machine learning frameworks such as TensorFlow, TensorRT, MXNet, PyTorch, Caffe2, CNTK and Chainer.

The company plans to offer support for Amazon Elastic Inference on G4 instances, which it claimed would slash the cost of inference by up to 75%.

According to AWS, G4 instances can also be used for graphics-intensive workloads. Application areas include remote graphics workstations, video transcoding, photo-realistic design, and game streaming in the cloud. It said G4 offered up to a 1.8x increase in graphics performance and up to 2x video transcoding capability over the previous-generation G3 instances.
