Kong engineering VP: Deeper, down (and in-between) AI workflows

This is a guest post for the Computer Weekly Developer Network written by Saju Pillai, SVP of engineering at Kong.

Kong (the company, not the mythical giant subtropical herbivorous mammal) is known for its open source API gateway platform for managing, monitoring and securing APIs and microservices to form a central connectivity layer between clients and typically cloud-native API-based applications.

Pillai writes in full as follows…

Early enterprise AI adoption was characterised by small-scale deployments, limited integrations and a focus on one or a few flagship use cases. That relative simplicity is now gone. Today, enterprises are running multiple models in parallel. 

General-purpose large language models (LLMs) handle broad tasks, specialised domain models cover legal or medical use cases and open-source models support experimentation or cost optimisation. No single model can meet every business need: different workloads demand different engines, and flexibility has become the new baseline for AI adoption.

But that flexibility comes at a price. Each API behaves differently. Each provider introduces new data governance considerations. Latency, cost and accuracy vary wildly between environments. The result is a web of fragmented workflows that places new demands on developers, data engineers and compliance teams alike.

It’s no longer enough to run one model well – businesses have to orchestrate multiple models responsibly and effectively. As multi-model workflows become the new normal, this orchestration has emerged as the defining factor in whether AI delivers lasting value or crumbles under the weight of its own complexity.

The multi-model movement

Given the pace of AI growth, it’s become clear that relying on a single AI model or provider was only ever temporary. Innovation in AI has sent boardrooms and decision-makers into overdrive as they consider the productivity-enhancing and cost-saving potential of the technology. And that has naturally led to the deployment of AI into departments where it had been much less common, such as HR and legal. But just because one AI model works effectively in one area of the business doesn’t mean it will work well in every area.

Different models are optimised for different things – some excel at summarisation and translation, others at reasoning over structured data, others still at generating code or analysing images. Enterprises quickly realised that a portfolio of models delivers better results than a one-size-fits-all approach. Cost and regulatory considerations reinforce this pattern: organisations want the ability to route workloads to lower-cost providers, switch to open-source for sensitive data, or comply with regional data residency requirements. A single-model strategy in this context is as limiting as it is fragile. 

In practice, multi-model usage isn’t just an option any more – it’s the norm.

For developers, that reality translates into a growing web of APIs, prompts and pipelines that need to interoperate. Each integration adds new complexity, from authentication and monitoring to version drift and error handling. Multi-model adoption is now the ground truth of enterprise AI. The question is how operators will orchestrate these models without creating bottlenecks, security gaps or runaway costs.

The risks of running multiple AI systems

Running multiple AI models in production brings undeniable advantages, including flexibility, specialisation and resilience.

However, it can also multiply the attack surface and operational risks. Each model comes with its own training data lineage, prompting behaviour and integration quirks, which can create blind spots in monitoring and governance. Without unified oversight, it becomes difficult to trace how data is flowing between systems, whether sensitive information is being inadvertently exposed, or how decision logic is being applied across models. For regulated industries in particular, this lack of observability is fast becoming a compliance liability.

Then there’s the challenge of consistency. When different models interpret similar prompts differently, the outputs can diverge in tone, accuracy, or even bias. In a customer service workflow, that could mean one chatbot offering an approved response while another unintentionally violates company policy. In financial services, it could mean discrepancies in how transactions are flagged for fraud review. And in healthcare, inconsistent responses could impact patient trust.

Left unaddressed, these risks undermine the very efficiency and innovation gains that drove multi-model adoption in the first place.

The connective layer

If multi-model adoption is inevitable, organisations need a way to coordinate, monitor and enforce policies across diverse AI models in real time. Technologies like an AI gateway can help ensure that requests are routed to the right model at the right moment, that outputs are logged for auditability and that data handling rules, such as anonymising personally identifiable information (PII), are applied consistently. In other words, organisations need that connective tissue between otherwise isolated AI systems to transform a sprawl of models into a coherent workflow.
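To make that concrete, here is a minimal sketch in Python of what such a control point does: it anonymises PII before a prompt ever reaches a model and logs the sanitised exchange for audit. The function names, regex patterns and the stub model client are illustrative assumptions, not any particular product’s API.

```python
import json
import logging
import re
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("ai-gateway.audit")

# Illustrative patterns only -- production PII detection uses NER models,
# dictionaries and locale-aware rules, not two regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def anonymise(prompt: str) -> str:
    """Replace detected PII with typed placeholders before forwarding."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

def gateway_call(prompt: str, model_client) -> str:
    """Apply the data-handling rule, forward the request, log for audit."""
    safe_prompt = anonymise(prompt)
    response = model_client(safe_prompt)  # hypothetical provider callable
    audit_log.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "prompt": safe_prompt,  # only the sanitised prompt is ever stored
        "response_chars": len(response),
    }))
    return response

def echo_model(prompt: str) -> str:
    """Stand-in for any provider SDK."""
    return f"(model saw) {prompt}"

print(gateway_call("Email jane.doe@example.com or call +44 20 7946 0958", echo_model))
# -> (model saw) Email [EMAIL] or call [PHONE]
```

A real gateway does the same job with far stronger detection and a proper tracing backend, but the shape of the control point – policy applied once, before any provider is touched – is the same.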

The absence of that connective infrastructure often leads to chaos – a state where integrations are built point-to-point, governance becomes reactive and performance issues are only discovered after they’ve impacted the user. By contrast, a well-connected environment creates a single layer of control. This makes it possible to implement safeguards such as prompt compression to reduce token usage, content filtering to prevent policy violations and intelligent routing to match workloads to the most efficient or cost-effective model.
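Intelligent routing in particular is easier to reason about with a toy example. The sketch below assumes an invented model pool and a deliberately naive keyword classifier; real gateways route on far richer signals such as cost, latency and accuracy targets.

```python
from dataclasses import dataclass

@dataclass
class ModelRoute:
    name: str
    cost_per_1k_tokens: float  # invented pricing, for illustration only

# Hypothetical model pool -- the names and prices are assumptions.
ROUTES = {
    "code":      ModelRoute("code-specialist-v1", 0.60),
    "summarise": ModelRoute("small-general-v2", 0.10),
    "default":   ModelRoute("large-general-v3", 1.20),
}

def classify(prompt: str) -> str:
    """Naive keyword classifier; real gateways use much richer signals."""
    lowered = prompt.lower()
    if "def " in lowered or "function" in lowered:
        return "code"
    if "summarise" in lowered or "tl;dr" in lowered:
        return "summarise"
    return "default"

def route(prompt: str) -> ModelRoute:
    """Match the workload to the cheapest model that can handle it."""
    return ROUTES[classify(prompt)]

print(route("Summarise this meeting transcript").name)  # -> small-general-v2
```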

For enterprises already wrestling with model sprawl, the ability to monitor and govern everything in one place is the only way to scale responsibly.

A gateway to a new era 

As enterprises experiment with an expanding mix of general-purpose, open-source and domain-specific AI models, a new architectural component is starting to take shape: the AI gateway. Much like API gateways transformed how services communicate in microservice environments, AI gateways offer a standardised entry point for requests moving between models. This control extends beyond just the prompts sent to LLMs to also govern how AI agents discover and use tools through emerging standards like the Model Context Protocol (MCP). Instead of every integration being custom-built and every risk being managed in isolation, gateways provide a consistent layer of control, policy and visibility.
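For a flavour of what MCP-level governance might look like, the sketch below filters a simplified tools/list response (MCP’s JSON-RPC method for tool discovery) against an organisation’s allow-list before it reaches the agent. The tool names and payload are illustrative assumptions, simplified from the MCP wire format.

```python
# Gateway-side tool governance for MCP traffic (illustrative sketch).
ALLOWED_TOOLS = {"search_docs", "create_ticket"}  # hypothetical org policy

def filter_tools_response(jsonrpc_response: dict) -> dict:
    """Strip tools the calling agent is not permitted to discover."""
    result = jsonrpc_response.get("result", {})
    result["tools"] = [
        t for t in result.get("tools", []) if t.get("name") in ALLOWED_TOOLS
    ]
    return jsonrpc_response

# Example upstream response, simplified from MCP's JSON-RPC format.
upstream = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [
        {"name": "search_docs", "description": "Search internal docs"},
        {"name": "delete_records", "description": "Dangerous admin tool"},
    ]},
}
print([t["name"] for t in filter_tools_response(upstream)["result"]["tools"]])
# -> ['search_docs']
```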

In effect, AI gateways do for multi-model workflows what earlier generations of middleware did for APIs: they create order out of potential chaos, making scale possible without sacrificing governance.

According to Kong’s research, more than 40% of businesses are already enforcing AI governance through an AI gateway. This addresses several pain points at once. On the policy side, gateways can sanitise sensitive information before it ever reaches a model, enforce token limits to control costs and compress prompts to reduce latency. On the orchestration side, they enable intelligent routing, directing workloads to the most appropriate model based on cost, speed or accuracy requirements. And on the observability side, gateways create a central point for logging, tracing and monitoring, giving enterprises a real-time view of how their models are performing and interacting.
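As a final sketch, here is how the policy and observability sides can sit together at a single choke point. The token budget, the crude whitespace token estimate and the trace format are all assumptions for illustration, not a real gateway’s configuration.

```python
import time

MAX_TOKENS_PER_REQUEST = 2_000  # illustrative budget, set by policy

def rough_token_count(text: str) -> int:
    """Crude whitespace estimate; real gateways use the model's tokenizer."""
    return len(text.split())

def governed_call(prompt: str, model_client, trace: list) -> str:
    """Enforce a token budget, then record latency for central monitoring."""
    if rough_token_count(prompt) > MAX_TOKENS_PER_REQUEST:
        raise ValueError("prompt exceeds the per-request token budget")
    start = time.perf_counter()
    response = model_client(prompt)  # hypothetical provider callable
    trace.append({
        "latency_ms": round((time.perf_counter() - start) * 1000, 2),
        "prompt_tokens_est": rough_token_count(prompt),
    })
    return response

spans = []
governed_call("What is an AI gateway?", lambda p: "an entry point", spans)
print(spans)  # one span per call, ready to ship to a tracing backend
```

Because every request passes through the same function, cost limits and telemetry come for free with each new model added to the pool – which is precisely the argument for putting this logic in a gateway rather than in every application.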