NTT brings Large Action Model to “market”

NTT, Inc. used its recent NTT R&D Forum in Tokyo to detail a new AI technology called the Large Action Model (LAM).

What is a LAM?

This is an AI model that predicts customers’ intent from time-series data organised in the “4W1H” format (Who, When, Where, What and How), collected from various customer touchpoints including online channels and physical stores.
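To make the 4W1H format concrete, here is a minimal Python sketch of a single customer event and a time-ordered journey built from such events. The class name, field names and sample values are illustrative assumptions, not NTT’s or DOCOMO’s actual schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event4W1H:
    """One customer touchpoint event (hypothetical schema, for illustration)."""
    who: str        # customer identifier
    when: datetime  # timestamp of the event
    where: str      # touchpoint, e.g. "app" or "store"
    what: str       # what the customer interacted with
    how: str        # how they did it, e.g. "viewed" or "purchased"


def customer_journey(events, who):
    """Return one customer's events in time order."""
    return sorted((e for e in events if e.who == who), key=lambda e: e.when)


# Made-up sample data.
events = [
    Event4W1H("c001", datetime(2025, 1, 5, 14, 0), "store", "phone case", "purchased"),
    Event4W1H("c001", datetime(2025, 1, 3, 9, 30), "app", "data plan page", "viewed"),
    Event4W1H("c002", datetime(2025, 1, 4, 11, 0), "web", "streaming add-on", "viewed"),
]
journey = customer_journey(events, "c001")
```

Here `journey` holds customer c001’s app visit followed by the later store purchase, which is the kind of ordered sequence a model of this type would take as input.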

Distinct from a Large Language Model (LLM), which – as we know – generally works on understanding, interpreting and generating human language text, a LAM’s core function is to “translate human input” into concrete steps inside a given environment or system.

In practice, LAMs often serve as the foundation for AI agents.

In terms of reasoning and planning, LAMs often integrate sophisticated planning and logic capabilities to determine the optimal sequence of actions needed to achieve a user’s ultimate goal.

This technology enables “highly personalised 1-to-1 marketing” actions tailored to each customer’s needs. 

Numerical & categorical data

LAM is a generative AI technology with a structure similar to large language models (LLMs), specialised for time-series data that includes both numerical and categorical data.

Obvious enough perhaps, but numerical data is information expressed as a measurable quantity, while categorical data is used to classify information into groups. That is not quite the same as metadata and database parsing, but it is in the same ballpark, i.e. categorical data holds information that details qualities, characteristics or groups. Its values are typically labels or names, so it cannot be used for meaningful arithmetic.
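One common way to feed mixed data to a sequence model is to map categorical labels to integer ids and to bucket numerical values into ranges. The sketch below assumes that approach. The source does not say how NTT’s LAM encodes its inputs, and the labels, amounts and bucket edges here are made up.

```python
from bisect import bisect_right


def build_vocab(labels):
    """Assign each distinct label an integer id in first-seen order."""
    vocab = {}
    for label in labels:
        vocab.setdefault(label, len(vocab))
    return vocab


def bucketize(value, edges):
    """Return the index of the bucket that value falls into.

    edges must be sorted ascending; bucket 0 is below edges[0],
    and the last bucket is at or above edges[-1].
    """
    return bisect_right(edges, value)


payment_methods = ["card", "cash", "card", "qr"]  # categorical data
amounts_yen = [480, 12000, 3300, 950]             # numerical data
edges = [1000, 5000, 10000]                       # hypothetical bucket edges

vocab = build_vocab(payment_methods)
payment_ids = [vocab[p] for p in payment_methods]
amount_buckets = [bucketize(a, edges) for a in amounts_yen]
```

Averaging `amounts_yen` gives a meaningful figure. Averaging `payment_ids` does not, because the ids are arbitrary labels, which is the distinction drawn above.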

DOCOMO integration situation

NTT was responsible for the research, development and tuning of the model, while DOCOMO handled the integration of customer data, the construction of the LAM and the verification of the promotional effectiveness. As a result, the order rate for mobile and smart life-related services through telemarketing improved by up to 2 times compared with conventional methods.

Through design and parameter optimization, DOCOMO’s proprietary LAM was built using less than one day of computation, equivalent to approximately 145 GPU hours, on a GPU server equipped with eight NVIDIA A100 (40GB) units.
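The two figures are consistent with each other, as a quick calculation shows. It assumes all eight GPUs ran in parallel for the whole job, which the source does not state explicitly.

```python
gpu_hours = 145  # approximate total reported by NTT
num_gpus = 8     # NVIDIA A100 (40GB) units in the server

# Assumes perfectly even, fully parallel use of all eight GPUs.
wall_clock_hours = gpu_hours / num_gpus
print(f"{wall_clock_hours:.1f} hours")  # 18.1 hours
```

Roughly 18 hours of wall-clock time is indeed less than one day.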

According to NTT, “As companies aim to enhance customer satisfaction and create new revenue opportunities, advancing marketing strategies has become a key challenge. Until recently, most companies relied on ‘segment marketing’, which groups customers based on attributes such as age or gender and provides tailored proposals to each group. In recent years, however, ‘1-to-1 marketing’, which offers personalised proposals for each individual customer, has been gaining attention, creating a need for more precise customer understanding.”

The company says that to effectively implement 1-to-1 marketing, it is essential to utilise sequential behavioural data obtained from various daily customer touchpoints and to understand customer needs based on the entire process leading up to product purchases or service subscriptions, known as the customer journey.

However, because the frequency and format of data differ across touchpoints, integrating and analysing time-series data has been technically challenging. For example, app usage generates high-frequency operational logs, while in-store data mainly consists of lower-frequency data such as purchased items and payment methods. Integrating these diverse datasets in a unified manner is difficult, and when trying to further account for combinations and sequences of customer interactions, the complexity and computational cost of analysis increase significantly.
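As a rough illustration of the integration problem, the sketch below merges a high-frequency app log and a low-frequency store log into one time-ordered stream. The record layout and sample values are hypothetical, and this is not a description of how DOCOMO’s platform works.

```python
from datetime import datetime
from heapq import merge

# Each record: (when, who, where, what, how). Both lists are already time-sorted.
app_log = [
    (datetime(2025, 1, 3, 9, 30, 0), "c001", "app", "home screen", "opened"),
    (datetime(2025, 1, 3, 9, 30, 12), "c001", "app", "data plan page", "viewed"),
    (datetime(2025, 1, 6, 20, 5, 0), "c001", "app", "accessories page", "viewed"),
]
store_log = [
    (datetime(2025, 1, 5, 14, 0, 0), "c001", "store", "phone case", "paid by card"),
]

# heapq.merge lazily interleaves sorted inputs; the key orders by timestamp.
timeline = list(merge(app_log, store_log, key=lambda rec: rec[0]))
touchpoints = [rec[2] for rec in timeline]
```

The merged `timeline` places the single store purchase between the app sessions around it, so the sequence of interactions across channels is preserved.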

Meanwhile, NTT says it has been conducting research and development on an AI technology called LAM, which learns and predicts patterns of behavioural sequences in time-series data that includes both numerical and categorical data. 

Transformer-based model

This technology has an architecture similar to large language models (LLMs) and predicts future behaviour with a Transformer-based model. A Transformer is a type of neural network architecture used in deep learning, with particular relevance to Natural Language Processing (NLP), that employs a mechanism called self-attention.
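For readers unfamiliar with self-attention, here is a deliberately simplified sketch in plain Python. It omits the learned query, key and value projections that real Transformers use, and the causal masking (each position only sees earlier ones) is an assumption suited to predicting future behaviour, not a confirmed detail of NTT’s LAM. The input vectors are made-up stand-ins for event embeddings.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors, causal=True):
    """Scaled dot-product self-attention with identity projections.

    Queries, keys and values are all the input vectors themselves.
    With causal=True, position i attends only to positions 0..i.
    """
    d = len(vectors[0])
    outputs = []
    for i, query in enumerate(vectors):
        visible = vectors[: i + 1] if causal else vectors
        scores = [dot(query, key) / math.sqrt(d) for key in visible]
        weights = softmax(scores)
        # Each output is a weighted average of the visible value vectors.
        outputs.append([
            sum(w * value[j] for w, value in zip(weights, visible))
            for j in range(d)
        ])
    return outputs


# Three toy event embeddings (made-up values).
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextualised = self_attention(sequence)
```

Each output vector blends the current event with the earlier events it attends to, weighted by similarity. The first position can only see itself, so its output equals its input.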

In this collaboration inside the NTT Group, the two companies integrated their respective technologies. By using DOCOMO’s CX Analytics Platform to consolidate customer data into time-series form and applying NTT’s LAM with an optimised tuning method, they built DOCOMO’s proprietary LAM, achieving reductions in computational cost.