Nvidia takes chatbots a step closer to conversational AI

New graphics processing unit platform, running on Azure, is set to make Bing a little bit cleverer

A new artificial intelligence (AI)-powered speech recognition system from Nvidia is set to power voice assistant search in Microsoft Bing and a new generation of chatbots. The technology could lead to chatbot systems that respond more like a real human.

Nvidia claimed the new system could power chatbots that operate more realistically than existing AI systems. To achieve this, it said the platform had been optimised to run queries on vast datasets.

According to Nvidia, it is extremely difficult for chatbots, intelligent personal assistants and search engines to operate with human-level comprehension because extremely large AI models cannot be deployed in real time. To overcome this limitation, Nvidia said it had added key optimisations to its AI platform, which it claimed could deliver complete AI inference in just over two milliseconds.

“Large language models are revolutionising AI for natural language,” said Bryan Catanzaro, vice-president of applied deep learning research at Nvidia. “They are helping us solve exceptionally difficult language problems, bringing us closer to the goal of truly conversational AI.”

Rangan Majumder, group program manager at Microsoft Bing, said: “In close collaboration with Nvidia, Bing has further optimised the inferencing of the popular natural language model Bert using Nvidia GPUs, part of Azure AI infrastructure, which led to the largest improvement in ranking search quality Bing has deployed in the last year.

“We achieved two times the latency reduction and five times throughput improvement during inference using Azure Nvidia GPUs compared with a CPU-based platform, enabling Bing to offer a more relevant, cost-effective, real-time search experience for all our customers globally.”

Using its T4 GPUs running its TensorRT library, Nvidia said the new platform ran inference with the Bert-Base model on the SQuAD question-answering dataset in only 2.2 milliseconds – well under the 10-millisecond processing threshold for many real-time applications.
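To put those two figures side by side, the short Python sketch below works out how much of a 10-millisecond budget a 2.2-millisecond inference would consume. Only the two timings come from Nvidia's announcement. The function name, the field names and the requests-per-second figure are illustrative assumptions, and the last of these supposes one request handled at a time with no batching, network or pre-processing overhead.

```python
# Figures quoted by Nvidia in the article.
INFERENCE_MS = 2.2    # reported Bert-Base inference time on T4 GPUs
THRESHOLD_MS = 10.0   # processing threshold cited for many real-time applications


def latency_headroom(inference_ms: float, threshold_ms: float) -> dict:
    """Hypothetical helper: compare one inference against a latency budget."""
    return {
        # Time left in the budget for everything other than inference.
        "headroom_ms": threshold_ms - inference_ms,
        # Fraction of the budget used by inference alone.
        "budget_used": inference_ms / threshold_ms,
        # Assumption: strictly sequential requests, no batching or overhead.
        "sequential_per_second": 1000.0 / inference_ms,
    }


if __name__ == "__main__":
    stats = latency_headroom(INFERENCE_MS, THRESHOLD_MS)
    print(f"Headroom: {stats['headroom_ms']:.1f} ms")
    print(f"Budget used: {stats['budget_used']:.0%}")
    print(f"Sequential inferences per second: {stats['sequential_per_second']:.0f}")
```

On these numbers, inference would take up roughly a fifth of the budget, leaving the rest for networking, ranking and other work in the request path. Real-world throughput would depend on batching and hardware utilisation, which the article does not detail.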

