In-Network Computing: DOCOMO & NTT demo low-latency AI video analytics 

This is Mobile World Congress (MWC)… and despite claims of Catalan backlash in the face of overtourism and, further, despite the two-hour queues to get through immigration at Barcelona-El Prat airport, the Computer Weekly Developer Network team is on the ground and ready to get in 11 miles a day walking around the Fira Gran Via to soak up the atmosphere.

DOCOMO i NTT demostren l’anàlisi de vídeo amb IA de baixa latència mitjançant computació a la xarxa amb recursos de GPU remots… would be the headline, if we stayed true to local parlance and ran the news in Catalan.

But we’ll stay in English, so the first alert that hit us this week was news of DOCOMO (hereafter Docomo) & NTT demonstrating low-latency AI video analytics using in-network computing with remote GPU resources.

Tech terms defined

Let’s define what that means.

NTT is of course the parent company of Docomo, which is itself known as Japan’s leading mobile operator, providing telecommunications, so-called smart life services (payments, insurance, healthcare and streaming content) and 5G connectivity.

Low-latency AI video analytics is the use of high-speed data and information network technologies (often used in tandem with a proportion of edge computing) to process live video in near real time.

In-network computing sees a computer system processing data directly within its network hardware boundaries to reduce latency… and (finally, in terms of exposition) remote GPU resources sees the system use off-site GPUs to power the compute element via a network.

In-Network Computing (INC)

On to the news: NTT Docomo, Inc. and NTT, Inc. have successfully demonstrated low-latency AI video analysis using In-Network Computing (INC) with INC Edge, which connects remotely distributed GPU resources and 5G networks via IOWN APN (the All-Photonics Network of NTT’s Innovative Optical and Wireless Network initiative).

In this demonstration, AI inference was controlled directly from the network through INC Edge implemented on the 5G core network. This setup enabled video data sent from devices to be analysed with minimal delay using remotely connected GPU resources via IOWN APN.

The firms say they will continue testing and standardising INC technology to support the use of simplified devices… and, ultimately, they aim to realise a network that enables AI and robots to unlock their potential in the 6G era.

6G with immersive XR

In the 6G era, new services leveraging immersive XR (Extended Reality), AI and robotics are expected to emerge. For clarification, XR denotes technology that merges the physical and digital worlds to create a sense of presence.

“These services often require high-volume, low-latency data transfer and large-scale data processing. For example, an autonomous robot may need to capture surrounding video and sensor data, analyse obstacles with AI… and provide real-time feedback for navigation. For applications running AI inference on small robots or lightweight wearable devices, maintaining a seamless user experience requires not only device-side processing but also the ability to process large volumes of data in real time outside the device. 6G networks are thus expected to handle both communication and service data processing to ensure service quality,” said the company, in a press statement.

Traditionally, distributed AI inference has been controlled by applications or servers, with networks mainly serving as data transport. This made service quality highly dependent on the location of GPU resources and on network latency. Keeping delays down often meant placing computational resources close to the device, which limited how flexibly those resources could be used.

To address these challenges, Docomo and NTT have been developing INC as a key technology for 6G networks. INC distributes computing resources, including GPUs, across the network and controls both communication and service computation centrally, enabling high-quality delivery of AI and other advanced services.

Demonstration time

In this demonstration, distributed remote GPU resources were connected to the 5G network via INC Edge over the IOWN APN to test AI inference processing. Video data sent from devices was controlled together with communication processing through INC Edge, then transmitted to remote GPU resources via IOWN APN and processed by AI video analysis applications.

Normally, distributed AI inference assumes GPUs are located nearby, as communication delays between distant resources can significantly slow overall processing. This demonstration tested whether INC Edge, controlling AI inference in-network alongside communication, could maintain high inference performance even using geographically distant GPU resources.
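
To put the distance problem in rough numbers, light in optical fibre travels at about two-thirds of its speed in a vacuum, which works out to roughly 5 microseconds per kilometre in each direction. The Python sketch below estimates that propagation delay only. It is a back-of-the-envelope illustration, not something taken from the demonstration: the function names and the refractive index value are our own assumptions, and switching, queuing and processing delays are ignored.

```python
# Rough propagation-delay estimate for an optical fibre link.
# Assumption: a refractive index of 1.468, a typical figure for silica fibre.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBRE_REFRACTIVE_INDEX = 1.468


def one_way_delay_ms(distance_km: float) -> float:
    """Propagation delay in milliseconds over distance_km of fibre."""
    speed_in_fibre = SPEED_OF_LIGHT_KM_PER_S / FIBRE_REFRACTIVE_INDEX
    return distance_km / speed_in_fibre * 1000.0


def round_trip_delay_ms(distance_km: float) -> float:
    """Propagation delay for a request and its reply over the same path."""
    return 2.0 * one_way_delay_ms(distance_km)


if __name__ == "__main__":
    # Hypothetical distances between a cell site and a GPU site.
    for km in (10, 100, 1000):
        print(f"{km} km: {round_trip_delay_ms(km):.3f} ms round trip")
```

Propagation delay grows linearly with distance and cannot be engineered away, which is why the rest of the path (transport, control and inference) has to be kept tight when GPUs sit far from the device.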

For this demonstration, the INC Edge was newly implemented with two key network functions: a feature to connect the IOWN APN with the mobile network and a mechanism to split AI inference into pre-processing and execution stages. Pre-processed data was transmitted and distributed to remote GPUs with low latency using network functions implemented on INC Edge. 
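
The firms have not published how the split works internally, so the following Python sketch is purely illustrative of the general idea of a two-stage pipeline: a light pre-processing step that shrinks the data before it is sent, and an execution step that runs on a remote worker. Every name here (preprocess, execute, dispatch, the "gpu-a" and "gpu-b" labels) is hypothetical, the "model" is a trivial brightness check standing in for real AI inference, and the round-robin distribution is our own assumption.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, List, Tuple

Frame = List[List[int]]  # grayscale image, pixel values 0-255


@dataclass
class PreprocessedFrame:
    frame_id: int
    pixels: List[List[float]]  # downsampled, normalised to 0.0-1.0


def preprocess(frame_id: int, frame: Frame, stride: int = 2) -> PreprocessedFrame:
    """Stage 1: downsample by `stride` and normalise before sending."""
    small = [[px / 255.0 for px in row[::stride]] for row in frame[::stride]]
    return PreprocessedFrame(frame_id, small)


def execute(item: PreprocessedFrame, threshold: float = 0.5) -> Dict[str, object]:
    """Stage 2: stand-in for inference; flags frames that are mostly bright."""
    values = [px for row in item.pixels for px in row]
    mean = sum(values) / len(values)
    return {"frame_id": item.frame_id, "mean": mean, "alert": mean > threshold}


def dispatch(
    items: List[PreprocessedFrame], workers: List[str]
) -> List[Tuple[str, Dict[str, object]]]:
    """Assign pre-processed frames to workers in round-robin order."""
    assignment = cycle(workers)
    return [(next(assignment), execute(item)) for item in items]


if __name__ == "__main__":
    dark = [[0] * 4 for _ in range(4)]
    bright = [[255] * 4 for _ in range(4)]
    batch = [preprocess(0, dark), preprocess(1, bright)]
    for worker, result in dispatch(batch, ["gpu-a", "gpu-b"]):
        print(worker, result)
```

The point of separating the stages is that the first one reduces how much data has to cross the network, while the second can be placed wherever GPU capacity happens to be available.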

Priority control in Docomo’s commercial 5G core network on AWS ensured high-bandwidth, low-latency transmission. Combining these capabilities with INC Edge enabled fast, in-network AI video analysis. In this demonstration, the combined end-to-end latency of communication and AI video analysis was confirmed to be within the latency requirements assumed for autonomous robot operation in close proximity to humans. These results indicate that sufficiently low latency can be achieved to enable remote robot control in the 6G era.
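
The companies did not disclose the measured latency or the target figure, so the short Python sketch below only shows the shape of such a check: add up the delay contributed by each leg of the path and compare the total with a requirement. The component names and all of the millisecond values are placeholders we have invented for illustration, not results from the demonstration.

```python
from typing import Dict


def end_to_end_ms(components: Dict[str, float]) -> float:
    """Total latency as the sum of each leg's contribution in milliseconds."""
    return sum(components.values())


def within_budget(components: Dict[str, float], requirement_ms: float) -> bool:
    """True if the combined latency meets the stated requirement."""
    return end_to_end_ms(components) <= requirement_ms


if __name__ == "__main__":
    # Placeholder values only; none of these come from Docomo or NTT.
    legs = {"radio_access": 5.0, "transport": 1.0, "inference": 20.0}
    print("total ms:", end_to_end_ms(legs))
    print("meets 50 ms target:", within_budget(legs, 50.0))
```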

The results of this demonstration are expected to be applicable to data transmission and processing for AI and robotics in the 6G era. 

The road ahead

Going forward, Docomo and NTT will continue advancing INC as a core 6G network technology by promoting further research, validation and international standardisation. 

Through integration of communication and data processing, they aim to support the widespread adoption of simplified devices and realise a network that enables AI and robots to unlock their potential in the 6G era.