LLM series - Nutanix: Boxed for builders, how to forge developer AI tools

This is a guest post for the Computer Weekly Developer Network written by James Sturrock in his position as director of systems engineering, UK&I at Nutanix.

Sturrock writes as follows in full…

As we look out across the topographies that today describe modern enterprise software stacks, systems and solutions, we can clearly see that Artificial Intelligence (AI) – whether that be predictive, reactive or generative AI – has now become a core mandate.

With generative AI coming to the fore and now promising to bring accelerating automation advantages to every conceivable application we use, it may appear to be a case of boiling the ocean, drinking from a firehose and moving heaven and earth for IT departments that have yet to really embark on this journey. It can certainly seem like a daunting task, so where should we start?

Firstly, let’s remind ourselves how AI should be implemented.

AI use cases

Organisations in all industries now have the opportunity to recognise how AI can be used to amplify business security and reliability; many firms will find that AI can be applied to enhance data protection and Disaster Recovery (DR) as key cornerstones.

We can also remind ourselves how broad the application of AI can (and arguably should) be. Because AI isn't confined to one location, a large proportion of firms should think about how to boost investment in their AI edge computing strategy as it extends into the Internet of Things (IoT).

But for AI to be truly effective, an organisation needs to be able to evidence a commensurate level of data management competency. In this age of cloud, it makes sense to identify platforms that can deliver integrated enterprise data services and security to protect the company's AI models and data. It's no good having bad data – poorly parsed, badly deduplicated, unstructured or unsecured – and even when an organisation has good data, it's no good if the business information streams are offline. This is a platform-level progression and it should be regarded as such.
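
To make the 'bad data' point concrete, here is a minimal sketch (in Python, with hypothetical document names) of one small piece of that hygiene work: dropping exact-duplicate documents by content hash before they ever reach an AI pipeline. A production pipeline would go much further, with proper parsing, near-duplicate detection and access controls.

```python
import hashlib

def dedupe_documents(docs):
    """Drop exact-duplicate documents by content hash.

    `docs` is assumed to be an iterable of (doc_id, text) pairs;
    a real pipeline would also normalise text more carefully and
    catch near-duplicates, not just byte-identical copies.
    """
    seen = set()
    unique = []
    for doc_id, text in docs:
        digest = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((doc_id, text))
    return unique

# Illustrative corpus: document ids and contents are made up.
corpus = [
    ("a", "Quarterly DR runbook, v2"),
    ("b", "quarterly dr runbook, v2  "),   # duplicate after normalisation
    ("c", "Edge telemetry schema notes"),
]
print(dedupe_documents(corpus))  # -> two unique documents survive
```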

A before AI

Getting AI running in an already-operational business is a team effort. This means working with both data scientists and developers (plus, of course, business stakeholders) to bridge the gap between their worlds and to make AI usable via platforms that deliver cloud-like operations, enabling seamless access to workloads and data…

Nutanix’s Sturrock: Focus on key foundational structures at the infrastructure level for AI foundation models.

… and, as a secondary but no less fundamental point here, if we had to put an extra A before AI, it would stand for accessibility, adaptability and application applicability. This means that AI-empowered applications should be accessible and adaptable from the start. Organisations need to be able to buy, build or modify AI models and data from any source; they should then be able to run them anywhere that their business requires the injection of AI.

But as much as we have said thus far (and to further validate our A for accessibility), we must also remember that using generative AI should not require a full-blown subscription to a hyperscaler Cloud Services Provider (CSP).

Our own CEO Rajiv Ramaswami reminds us that because many IT teams are still in a prototyping phase with AI, they are almost 'playing' with it as they discover how its benefits can be captured, harnessed and indeed monetised. But we know that AI requires solid infrastructure and there is already a shortage of GPUs. It may be easy enough to build an AI model in the public cloud, but implementation is much more cost-effective in a hybrid cloud.

AI to go, carry out

There’s a packaging process happening here.

You could call it packaged generative AI, you could call it a compartmentalised intelligence services framework (if you wanted to make heads spin and over-jargonise it)… or you could just call it GPT-in-a-box – spoiler alert, we did call it that. What it means is that when software application developers look at the massive learning curve they face in gaining competency with LLM data science techniques, there are shortcuts that help validate (there's that word again, sorry) the use of these technologies in live production environments. This is a turnkey software-defined solution designed to integrate generative AI and AI/ML applications into an organisation seamlessly, while keeping data and applications under the organisation's control.

When firms look at the consolidation factor that they can achieve in modern IT stacks, they can think about bringing together unified storage for data and their MLOps stack in one place.

As our own VP of systems engineering Paulo Pereira has previously explained, "Companies see the potential of generative AI and want to get the same results. But they wonder whether they want to bring their data to the cloud. Your internal data becomes part of the public domain. But models by themselves are worthless without data. GPT-in-a-box uses open-source models and trains them with private data. Users thus retain control over their data because it stays within the organisation."
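
As an illustration of the pattern Pereira describes – an open-source model running where the data lives – the sketch below uses the Hugging Face transformers library to load a locally stored model and answer a prompt without anything leaving the organisation's infrastructure. The model path and prompt are illustrative assumptions, not part of the GPT-in-a-box product itself.

```python
# Sketch: run an open-source LLM entirely on local infrastructure,
# so prompts and private data never leave the organisation.
# Assumes the Hugging Face `transformers` library and a model
# already downloaded to local storage (the path is hypothetical).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "/models/open-llm"  # illustrative on-prem path

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, local_files_only=True)

prompt = "Summarise our internal DR policy in three bullet points:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```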

Underlying mechanics for developer AI

The goal of this discussion is to consider the environments required for developers to work with Large Language Models (LLMs) and for that engineering to result in AI & ML models that are functional, bias-free, cost-effective, safe and secure. For any of that to happen, software teams need to look at the underlying mechanics and supporting platforms on which these workloads execute. That means looking for cloud substrate services that can cope with model training and with serving the generative AI that results.

We presume that a deep understanding of how pre-training works leads to better decision-making for LLM system design, including model sizing, dataset sizing and compute infrastructure provisioning. Some customers might need to pre-train a model from the ground up because their datasets are very different from anything a public model has seen. Compute infrastructure provisioning for LLM pre-training can be a complex endeavour, especially once concerns around data privacy, data sovereignty and data governance come into play.
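
To give a feel for that sizing exercise, here is a back-of-the-envelope sketch using two widely cited rules of thumb: training compute of roughly 6 × parameters × tokens FLOPs, and the 'Chinchilla' guideline of around 20 training tokens per parameter. The sustained per-GPU throughput figure is an assumption for illustration only.

```python
# Back-of-the-envelope LLM pre-training sizing.
# Rules of thumb: compute ~= 6 * N * D FLOPs (N = parameters, D = tokens),
# and the "Chinchilla" guideline of ~20 tokens per parameter.
# The sustained GPU throughput below is an illustrative assumption.

params = 7e9                      # a 7B-parameter model
tokens = 20 * params              # ~140B training tokens
flops = 6 * params * tokens       # total training compute

gpu_sustained_flops = 150e12      # assume ~150 TFLOP/s sustained per GPU
gpu_seconds = flops / gpu_sustained_flops
gpu_hours = gpu_seconds / 3600

print(f"Compute:   {flops:.2e} FLOPs")
print(f"GPU-hours: {gpu_hours:,.0f} "
      f"(e.g. ~{gpu_hours / 256:,.0f} wall-clock hours on 256 GPUs)")
```

Numbers like these are rough, but they make the trade-off visible early: doubling the model size roughly quadruples the Chinchilla-optimal training compute, which is exactly the kind of provisioning question to settle before procurement.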

Once again, this is a platform play consideration and developers need to think about the back-end before they build the front-end.

MLOps is tops

As we move ahead in this space and start to enable an increasing number of tools for generative AI software development, we are already working to ensure we can balance the infrastructure services beneath.

Due to significant advancements in machine learning over the past few years, AI models can now be applied to a wide range of complex tasks. We will need to embrace MLOps as a key methodology to underpin the lifecycle of models, with workflows for each piece of that lifecycle: training, hyperparameter tuning, inference and so on.
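
As a toy illustration of that idea – the lifecycle expressed as explicit, ordered workflow stages – the sketch below chains hypothetical training, tuning and inference steps. The stage names and the hand-rolled class are placeholders, not a real framework API; production teams would hand this to a workflow orchestrator.

```python
# Minimal sketch of an MLOps-style model lifecycle as explicit stages.
# Stage functions and values are hypothetical placeholders.
from dataclasses import dataclass, field

@dataclass
class ModelLifecycle:
    stages: list = field(default_factory=list)

    def stage(self, fn):
        """Register a lifecycle stage (training, tuning, inference...)."""
        self.stages.append(fn)
        return fn

    def run(self, context: dict):
        for fn in self.stages:
            print(f"--> {fn.__name__}")
            context = fn(context)
        return context

pipeline = ModelLifecycle()

@pipeline.stage
def train(ctx):
    ctx["model"] = f"model trained on {ctx['dataset']}"
    return ctx

@pipeline.stage
def tune_hyperparameters(ctx):
    ctx["best_lr"] = 3e-4  # placeholder search result
    return ctx

@pipeline.stage
def deploy_for_inference(ctx):
    ctx["endpoint"] = "https://inference.internal/llm"  # illustrative URL
    return ctx

print(pipeline.run({"dataset": "private-corpus-v1"}))
```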

There is complexity here, but there is also key foundation support at the infrastructure level for the foundation models that we now drive into our LLM-empowered generative AI development. It’s time to look ahead, but make sure we have our ‘feet’ planted on the right infrastructure and cloud platform first.