How Dataiku is supporting ‘everyday AI’

Dataiku president Krish Venkataraman outlines what it takes for enterprises to scale and govern their use of artificial intelligence while making the technology accessible across their business

Founded in 2013, at a time when artificial intelligence (AI) wasn’t as hyped up as it is today, Dataiku has been on a mission to democratise access to data to support better decision-making for enterprises.

Now, it hopes to do the same for AI, building on the capabilities it has honed over a decade in data governance and DataOps to support “everyday AI” deployments in the enterprise. In September 2023, it launched LLM Mesh, a “backbone” platform that helps with the governance, cost management and security of generative AI (GenAI) applications.

In an interview with Computer Weekly in Singapore, Dataiku’s president, Krish Venkataraman, and its vice-president and general manager for Asia-Pacific, JY Pook, talked up the company’s edge in the market for machine learning operations (MLOps) and DataOps tooling, how it works with partners like Amazon Web Services (AWS) and the outsized growth opportunities in Asia.

There’s a lot of noise around AI, MLOps and DataOps with different players, including hyperscalers, data platforms and even chipmakers, eyeing a slice of the pie. How is Dataiku standing out?

Venkataraman: Like any new capability, there will always be initial players that play between components. AI is now following the same theme that we see with any emerging technology, where there are lots of participants and then everybody figures out where they have the best ability to innovate, sell capabilities and expand.

Today, there’s a lot of focus on compute by companies like Nvidia and others. There’s also a huge focus on large language models (LLMs) because they are, in essence, the unit device that the iPhone was. In the next few years, there will be a greater focus on the application layer that takes advantage of the compute and LLMs. And with that, enterprises will need the ability to orchestrate and govern the entire environment, and more importantly, give users access to their own data. That’s the space we’ve excelled in for a long time. Over 10% of the global 2000 companies are our customers. It’s a big achievement, because those companies, which tend to be highly regulated and complex, require products that scale and provide a level of governance and security.

Underpinning our success are three things we have solved, starting with the acceleration of digitisation through generative AI that will give every employee, not just data scientists, access to data and AI capabilities. The second thing is the modernisation of the data stack, but in a different way. With AI agents having the ability to provide people with access to different sets of their own data, it’s no longer just about having a centralised data warehouse or data lake. The third thing, especially in the enterprise, is governance. How you govern your environment and distribute AI capabilities is key. But it’s not just about regulatory compliance, it’s also understanding who has access to what capabilities in a way that accelerates the use of AI rather than slows it down.

That is what enterprise AI is about. If you don’t have those three things solved, it doesn’t matter how much compute you buy. It doesn’t matter how many LLMs you buy. You really can’t drive value from your data without those three streams working together.

It appears that the frameworks around AI governance are still evolving, unlike areas such as software development where you have things like a software bill of materials (SBOMs) that tells you what goes into a software application. There’s nothing like that for AI models where you’d know what went into an AI model. What are your thoughts on that?

Venkataraman: If you look at technology governance, over time, a tremendous number of frameworks have been developed, whether in cyber security or data management. We’re seeing some developments in AI governance frameworks that revolve around the concepts of fairness, accuracy and transparency. Our fundamental expectation is that for governance to work, it has to move from a “no” to a “yes” model, and that requires us to build a framework, as well as capabilities like LLM Mesh where governance is seamless and transparent.

Governance also helps with cost management because that’s part of the governance environment and gives users the ability to understand governance without thinking of governance as a “no” factor. That’s what we are putting a lot of investment into because we feel that managing a very complicated LLM environment is going to be key for enterprises. For that to happen, we need to build a foundation layer now, not think about governance a year or two from now, before it becomes really complicated and stops innovation. If we don’t solve for governance, the democratisation of AI will not be possible.

What Dataiku did with cost management in LLM Mesh was interesting. The hyperscalers have different prices for LLM application programming interfaces (APIs) in different markets – how did you navigate the AI ecosystem to make sure you can reflect the actual cost of an organisation’s AI deployments?

Venkataraman: Most enterprises don’t have a golden ticket where they can say that their environment is so well laid out that everything magically happens every day. Most of them have a complicated data environment and compute structure. A lot of companies are talking about cloud, but most regulated ones are still living on-premises and in hybrid environments. In those environments, how can we be a centralised player that helps orchestrate things in a way that’s valuable for everybody in the ecosystem? It’s not purely about cost – it’s also about the combination of the right components that creates the most value for data users. If it’s only about cost and results, you’d tend to create an environment that doesn’t provide the same level of transparency for the user.

The other thing is that a lot of the decision-making right now tends to happen at the data scientist or data engineer level. For data to be used by everybody, we need to provide that decision capability for the rest of the organisation. That can’t happen without creating a unified layer like a mesh. What we have is the first iteration of LLM Mesh today, but as we build it up with input from our clients, you should see a lot more value creation and use cases.

Dataiku has a broad set of capabilities in its portfolio. What’s the typical entry point for most of your customers?

Venkataraman: I asked the same question when I first joined the company. The answer is it varies, because unlike other data or AI companies that tend to take a vertical approach, we’ve always had a unique cross-functional approach. We want to be a platform for everybody, whether you are in finance, marketing or health sciences. That means our starting point tends to be users who have a clear understanding of data, have some ability to do some basic coding, and have a good statistical foundation to understand the data. That group of people sits in every part of the organisation.

So, we may hit a tech company that’s building a foundation for AI and using our product in risk management. It could also be a large financial institution using us to help their financial advisors create bespoke portfolios for every client. Or it could be a large pharma company that needs to understand how to effectively position their drugs to the right audience at the right place and at the right time.

In essence, our use cases are not limited to just one small subset, and that’s valuable to our customers because when they are buying tooling today, they want it to solve not just one unique problem or address specific use cases, but also the ability to use it in every functional area of their business. That’s where I think we’ve done a very good job of landing at business units with a major problem, and then quickly expanding based on word of mouth to other parts of our clients’ business. We spend little to no money on broad-scale marketing – instead we spend a lot of time with our clients so that they can champion our platform to other organisations.

Dataiku also partners with the hyperscalers. How are the dynamics of those partnerships playing out given that the hyperscalers offer similar MLOps tooling?

Venkataraman: I think we have a phenomenal relationship with AWS, as well as Snowflake and Databricks. The compute doesn’t sit on Dataiku, so we are pushing compute to those partners. The more we succeed with our clients, the more value they create, and in software, there will always be this grey layer between application providers and foundational layers. I think we’ve done a pretty good job of managing that, otherwise I don’t think we would have been recognised as AI partner of the year by AWS, Snowflake and Databricks.

And by the way, there are cases where we are not that valuable, and in some cases they are not valuable. It depends on who we’re selling to. We tend to partner if it’s selling to a data scientist who does not want a combination of coding and no-code tools. Then, there are places where we’re selling to data analysts and “everyday AI” communities – that’s where we excel through our ability to provide the same statistical capability and tooling to a person who may not have a PhD in statistics. That’s the market where we have most users.

What sorts of investments has Dataiku made in the Asia-Pacific region and what’s next moving forward?

Venkataraman: We already have large boots on the ground in Singapore, Japan and Australia. For us, it’s just doubling and tripling down on those investments. In fact, half our business should come from Asia in the future, because this is such a massive market and great organisations across the globe will come from here.

The other thing I can tell you is that governments in Asia are at the forefront of AI investment. Singapore has always done a phenomenal job when it comes to governments doing the right thing and setting things up correctly, but now you can see that throughout Asia. In fact, in a lot of cases, governments are ahead of enterprises in AI investment.

Which industries are you most focused on in the region?

Pook: Our platform is pretty horizontal so different industries and companies of different sizes can use it. The question is, which industries and companies are at the forefront of trying to leverage AI? You see that in banks, insurance companies and governments, which Krish talked about. There’s also retail, pharmaceutical and manufacturing. For example, there was an air-conditioner manufacturer that deployed sensors to collect data about the temperature in different parts of a room and combined that data with the data from the air-conditioners to optimise product performance. And so, we have most traction in industries that have a lot of data and are trying to use the data to make smarter decisions and achieve better outcomes every day.

Read more about AI in APAC

  • DBS Bank’s AI Industrialisation Programme has been instrumental in industrialising the use of data and AI across its business, resulting in over S$370m of incremental economic benefits.
  • Alibaba’s SeaLLMs are built to address the linguistic diversity and nuances in Southeast Asia, enabling businesses to deploy localised chatbots and translation applications.
  • Malaysian startup Aerodyne is running its drone platform on AWS to expand its footprint globally and support a variety of use cases, from agriculture seeding to cellular tower maintenance.
  • The Australian government is experimenting with AI use cases in a safe environment while it figures out ways to harness the technology to benefit citizens and businesses.
