Date: 2025-07-02

For Moe Abdula, Google Cloud’s vice-president of customer engineering for Asia-Pacific, the past year has seen a “drastic difference” in the artificial intelligence (AI) landscape, with customer conversations shifting from experimentation to tangible business value and return on investment (ROI).

“A year ago, people were saying, ‘everybody’s experimenting, but when are we going to get to production?’ Nobody’s asking that now,” he said in a recent interview with Computer Weekly. “We’re starting to see people build ROI and thinking about building AI by default.”

In doing so, the spotlight often falls on the latest AI models, but Abdula noted the importance of the underlying infrastructure that powers them. He pointed to two key pillars for Google: its partnership with Nvidia and the parallel development of its own specialised hardware.

At Google Cloud’s Next 2025 conference in April, the company announced that it is bringing its Gemini models to on-premises, air-gapped environments, with Nvidia Blackwell systems running on Google Distributed Cloud. The move is aimed at sectors with stringent data sovereignty and security requirements, such as government, defence and regulated industries like healthcare and finance.

At the same time, Google has developed its own tensor processing unit (TPU) architecture, which Abdula called the “backbone of AI” for Google. While graphics processing units (GPUs) have traditionally been optimised for training AI models, TPUs are designed for inference – the process of using a trained model to make predictions – which is where the bulk of the cost and computational load lies for large-scale AI services.

The rapid adoption of AI is also forcing a rethink of underlying infrastructure standards, including Kubernetes. The open-source container orchestration platform, which has become the de facto standard for modern applications, now faces a juncture as it adapts to the unique demands of AI workloads.

“Do we sustain a dual architectural model, which is pods and so on, and APIs [application programming interfaces] that allow you to interface at the resource level for something a little bit lighter and more connected with the resource managers of AI systems? Or do we simply let go of the whole pod?” said Abdula, adding that Google does not have a formal position yet, as it monitors developments in the open-source community.

In the interim, Google Cloud is enhancing its Google Kubernetes Engine to better support AI. This includes introducing code libraries that simplify the deployment of workloads to TPUs instead of GPUs. Additionally, Google has added more granular resource management controls to help organisations efficiently share expensive GPU resources among different teams.

As infrastructure evolves and AI models proliferate, Abdula positioned Google’s Vertex AI as the central platform to help customers manage the escalating complexity. Vertex AI lets organisations access and manage models from Google, open-source providers and other commercial labs, simplifying everything from commercial agreements to model migration.

For enterprises, a key challenge is managing the rapid pace of model updates, such as the transition from Gemini 1.5 to 2.5. Vertex AI has tools to ease this process, including an evaluation service to ensure quality parity between model versions and even the ability to make a new model behave like its predecessor. For users struggling with prompt compatibility, Abdula said Google has “introduced a toolkit that allows you to do the migration of prompts”.

Abdula also touched on the emergence of agentic AI systems – AI that can reason, plan and automate complex workflows by interacting with data and other systems. “This is perhaps one of the areas that everybody is seeking to understand how to work through,” he said.

Google Cloud was one of the first in the market to provide agentic AI development tools – its Vertex AI Agent Builder and the open-source Agent Development Kit (ADK) aim to simplify the creation of agents that can connect to any data source and interact with other agents.

While the productivity gains are promising, AI agents can bring governance challenges. A primary concern is how to manage the permissions and access rights of these agents as they become more integrated into core business processes.

“The idea of being able to deliver an agent is actually not that sophisticated or complex,” said Abdula. The real evolution, he said, is moving from simple task automation to agents that have reasoning capabilities with the right integrations and controls around what they’re authorised to do or not do.
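To make the idea of controls around what an agent is authorised to do more concrete, the following is a minimal sketch in Python of an allowlist check applied before an agent action runs. It is illustrative only and is not drawn from Vertex AI Agent Builder or the ADK: the names `AgentPolicy`, `PermissionDenied`, `authorise` and `run_action`, as well as the example agent and action names, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy: the set of actions one agent may perform."""
    agent_id: str
    allowed_actions: frozenset = field(default_factory=frozenset)


class PermissionDenied(Exception):
    """Raised when an agent attempts an action outside its allowlist."""


def authorise(policy: AgentPolicy, action: str) -> None:
    # Deny by default: anything not explicitly listed is refused.
    if action not in policy.allowed_actions:
        raise PermissionDenied(
            f"agent {policy.agent_id!r} is not authorised to perform {action!r}"
        )


def run_action(policy: AgentPolicy, action: str, handler: Callable[[], T]) -> T:
    # The check happens before the handler runs, so a denied action
    # never touches the underlying system.
    authorise(policy, action)
    return handler()


if __name__ == "__main__":
    policy = AgentPolicy("invoice-bot", frozenset({"read_invoice"}))
    print(run_action(policy, "read_invoice", lambda: "invoice loaded"))
    try:
        run_action(policy, "issue_refund", lambda: "refund issued")
    except PermissionDenied as err:
        print(f"blocked: {err}")
```

The sketch assumes a deny-by-default design, in which an agent can only perform actions that have been explicitly granted. A production system would typically also need audit logging and integration with an organisation's existing identity and access management, neither of which is shown here.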