Beware the AI black box

A guest blog by Matt Jones, Analytics Strategist at Tessella 

Big tech vendors have been piling into analytics and now AI. IBM’s Watson has been charming the media with disease diagnosis and Jeopardy prowess, and Palantir has been finding terrorists. Other major vendors such as Oracle and SAP have also been joining the scene, albeit with slightly less fanfare.

These companies have been leading tech innovation for years; one would expect them to offer a credible AI solution, and indeed their technology is good.

But the black box approach that many such companies currently offer presents problems. The first is that analytics is not a plug-and-play solution; it needs to be built with an understanding of data and context.

The second is that, in buying a black box solution, you lose control of your data. This means you cannot be sure what the data is telling you, you allow others to benefit from it for free, and you may not be able to access it in future.

And now you may even be in for a big bill for the privilege. A court recently ruled that drinks company Diageo had to pay SAP for additional licences covering all customers that indirectly benefitted from SAP software used in its organisation, costs that could run to £55m.

If the ruling is upheld, it could have a chilling effect on the analytics platform industry. Could every beneficiary of data insights across and beyond your organisation now need a licence? Companies should now think a lot harder before committing to a platform on which their whole business relies.

Your data is your company’s lifeblood – value it

Even before this case, it was stunning how much control of data companies were willing to give up to vendors, and how little thought went into the consequences of over-reliance on one technology.

Vendor lock-in is nothing new, of course: for years, consumers have been allowing Apple and Google to track their every move, and global businesses have been putting everything in the Microsoft cloud.

But data and AI represent this problem on steroids. Data projects done right are embedded throughout the entire organisation. Some solutions will suck in all your data from every system, lock it away, and even refuse you access to it. So these platforms have all your data, the context, and the insights it provides. And now they have even more power to charge you to benefit from it.

How to plan for an analytics solution

So, how can you benefit from data analytics without storing up future problems?

Before you even think about technology, get the right mix of technical and business people to look at what your business needs to achieve and how data can help you. Then get people in with data science expertise who can explore how your data can be used to support those needs.

Only then should you look at what platforms you need to achieve this. The most powerful is not necessarily the most suitable; find one suited to your needs. In doing so, consider licensing models and the demands placed on your data – does it leave your site? Can you access it when you need to? Look at the company’s overall culture – are they transparent or opaque? This will guide you in how they are likely to handle your data.

Perhaps more important is to consider whether you need a black box at all. Google, Microsoft and Facebook, amongst others, all offer openly available Artificial Intelligence (AI) APIs on which anyone can build bespoke AI or machine learning platforms – which are as sophisticated as any black box on the market. Furthermore, this gives you complete control over, and transparency into, how the data is fed in, processed and presented, so you can identify causal links between data and outcomes, rather than having to trust that someone else’s insights into your business are correct.

If you do need a black box solution – and there are times when they are the right option – you should ask whether the vendor is a partner or just a platform. Do they understand your business context? Do they integrate with your particular data setup? Do they leave you with control of your data? Do they make the data analysis process clear, so you can understand whether your business insight is based on a causal link or just an unsupported pattern spotted in the data?

The approach you take should be driven by what best solves your challenge or finds the insight you require to make better decisions – not by the platform itself – and it should consider what level of data control and oversight you are willing to give up. Once you have properly defined that, you can make the best decision about how to use your data to meet those goals.