Earlier this year, McKinsey released a report called The data-driven enterprise of 2025, which illustrated the journey organisations must go on to reach the ideal of making intelligent, informed decisions based on facts. It’s a great idea, and for some businesses probably achievable, but the reality could be very different if organisations don’t get a grip on data quality issues. If businesses persist with poor-quality data, decision-making will at best revert to old-school tactics. So much for the data-driven enterprise.
It is reminiscent of what CB Insights co-founder Anand Sanwal said a few years ago about decision-making, the rise of data analytics and the need for business leaders to find a sweet spot between data and human experience: “We joke with our clients that too often, these big strategic decisions rely on the three Gs – Google searches, Guys with MBAs, and Gut instinct.” Of course, we’ve moved on since then, haven’t we? Haven’t we?
Not quite. As some recent research by enterprise intelligence firm Quantexa revealed, 95% of European organisations are “crippled by the data decision gap”, where “inaccurate and incomplete datasets” are undermining organisations’ ability to make accurate and trusted decisions. Also, research by marketing analytics platform Adverity found that 63% of chief marketing officers (CMOs) make decisions based on data, but 41% of marketing data analysts are “struggling to trust their data”.
The Adverity report suggests there is misplaced optimism among marketing departments, with two-thirds identifying as “analytically mature” and yet, for 68%, data reports are spreadsheet-based. Manual data wrangling is a big challenge that many organisations are having to contend with. As the report says: “The number of manual processes the dataset goes through must be called into question. If data is being transferred from Facebook and LinkedIn to Excel and then into PowerPoint, this creates more cracks for human error to seep in.”
Chris Hyde, global head of data solutions at data management firm Validity, cites a customer example where high volumes of duplicate records were creating data distrust and an increased workload. Akamai Technologies manually verified data, made updates and merged duplicates on a daily basis, he says, leading to an overhaul of the customer relationship management (CRM) system to enable easier access to data management tools.
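Even a basic script can take much of the pain out of that kind of daily merge routine. Here is a minimal sketch in Python, assuming contact records keyed by email address (the field names are illustrative, not Akamai’s actual schema):

```python
from collections import defaultdict

def normalise_email(email: str) -> str:
    """Lower-case and strip whitespace so trivially different entries match."""
    return email.strip().lower()

def merge_duplicates(records):
    """Group contact records by normalised email and merge each group,
    letting the most recently updated record win field by field while
    older records fill in any gaps."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalise_email(rec["email"])].append(rec)

    merged = []
    for recs in groups.values():
        recs.sort(key=lambda r: r["updated"])  # oldest first, newest last
        combined = {}
        for rec in recs:
            # Skip empty values so a blank field never overwrites real data.
            combined.update({k: v for k, v in rec.items() if v})
        merged.append(combined)
    return merged
```

Real CRM deduplication would match on fuzzier criteria (name variants, address similarity), but the principle is the same: normalise, group, then merge with a clear survivorship rule.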
This sort of thing is only made worse by organisations amassing larger data volumes from internal and external sources, but not joining the data points. As Vishal Marria, founder and CEO of Quantexa, points out, this problem is exacerbated as business leaders look to increase growth via mergers and acquisitions, for example, and in the process inherit additional data silos into an existing fragmented data cluster.
“Data is only useful if it is managed in the right way, and legacy technologies – which are typically rules-based and reliant on batch processing – are falling short,” says Marria.
The Covid-19 pandemic has only made things worse. Validity’s Hyde cites a statistic from his firm’s The state of CRM data health in 2022 ebook, in which 79% of respondents agree that data decay has increased as a result of the pandemic. He says a lot of this has to do with employees transitioning into new roles and, in turn, their phone numbers, addresses and job titles changing with them. Also, with more remote working, office locations and addresses are becoming irrelevant.
All of this means that lead and contact information in the CRM is rapidly going stale, and team members who stay behind face growing workloads as their co-workers leave.
For many organisations, this inability to cope with change is symptomatic of poor data management plans and processes. Organisations cannot make decisions that are truly data-driven without a solid, leadership-backed approach to data management. Hyde says: “Although they may claim to, many companies are enabling unethical practices by not making data management a priority.”
Boiling the ocean
It’s a common theme. Are business leaders giving data management enough oxygen to ensure data is accurate? There is a sense that many organisations have been treating data quality as a box-ticking exercise rather than focusing effort on the most business-critical data. However, there is evidence that this is changing: more firms now recognise that striving for 100% data accuracy across the board is difficult and expensive, and are narrowing their focus accordingly.
Lori Witzel, director of research for analytics and data management at Tibco Software, says: “Historically, data quality has been a risk-and-cost-management technology driven by IT, with deduplication to save costs for storage and data movement, and accuracy to manage compliance for regulations like GDPR [General Data Protection Regulation] and COPPA [Children’s Online Privacy Protection Act]. But there is a new trend to democratise data quality so business stakeholders can focus project scope and self-serve, based on the importance of data quality for insights generation.”
Witzel says this shift accompanies the move away from data management teams “boiling the ocean”.
“Rather than seek perfect data quality organisation-wide, scope is tightened to just what is needed for important insights,” she adds. “If improving customer experience is high-value, you need a 360-degree view of customer engagement, with a data quality project to unify the different identifiers for a given customer.”
Witzel agrees it is difficult to build a 360-degree view of a customer without smart master data management to automatically detect and resolve mismatches.
“Once a ‘golden record’ is built, data virtualisation can then provide analytics workflows with the 360-degree customer view required to improve customer experience,” she says.
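The matching step behind such a “golden record” can be sketched with a simple union-find over shared identifiers. This is a deliberately naive illustration, assuming exact matches on email and phone, whereas commercial master data management tools apply fuzzy matching and configurable survivorship rules:

```python
def build_golden_records(records, keys=("email", "phone")):
    """Link records that share any identifier (email or phone here)
    and collapse each linked cluster into one golden record."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    seen = {}  # (field, value) -> first record index carrying it
    for i, rec in enumerate(records):
        for key in keys:
            value = rec.get(key)
            if not value:
                continue
            if (key, value) in seen:
                union(i, seen[(key, value)])
            else:
                seen[(key, value)] = i

    # Merge each cluster; a real tool would apply survivorship rules
    # rather than simply letting later records overwrite earlier ones.
    clusters = {}
    for i, rec in enumerate(records):
        clusters.setdefault(find(i), {}).update(
            {k: v for k, v in rec.items() if v}
        )
    return list(clusters.values())
```

Records that never share an identifier directly can still end up in the same cluster transitively, which is exactly how the “different identifiers for a given customer” get unified into one view.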
Of course, in reaching this point, we are making a few assumptions. Democratising data still demands a degree of data literacy across an organisation to really make it work. Given the shortage of data scientists (and the number of firms that squander those scarce skills on mundane tasks), the requirement for departmental data training is growing, but training takes time and does not address the root of the problem.
A lack of proactivity and leadership leads to poor data foundations, which undermine attempts to maintain data and push teams into a firefighting approach, with inevitable consequences.
Patrick Peinoit, principal product manager at Talend, supports the need for more proactivity (something the UK government’s data quality guidance also demands), saying organisations need to check and measure data quality before the data gets into their systems.
“Accessing and monitoring data across internal, cloud, web and mobile applications is a huge undertaking,” he says. “The only way to scale that kind of monitoring across those types of systems is by embedding data-quality processes and controls throughout the entire data journey. This approach can also help to scale data quality as amounts and varieties of data increase.”
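Embedding controls at the point of entry might look like the following sketch, where incoming records are validated against declared rules and failures are quarantined rather than passed downstream unflagged (the rules themselves are illustrative, not Talend’s):

```python
import re

# Illustrative rules: each field maps to a predicate it must satisfy.
RULES = {
    "email": lambda v: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v or "")),
    "country": lambda v: v in {"UK", "US", "DE", "FR"},
    "revenue": lambda v: isinstance(v, (int, float)) and v >= 0,
}

def validate(record):
    """Return the list of fields that failed; empty means the record passes."""
    return [field for field, check in RULES.items() if not check(record.get(field))]

def ingest(records):
    """Split incoming records into accepted and quarantined, so bad data
    is caught before it reaches downstream systems."""
    accepted, quarantined = [], []
    for rec in records:
        failures = validate(rec)
        if failures:
            quarantined.append({"record": rec, "failed": failures})
        else:
            accepted.append(rec)
    return accepted, quarantined
```

Because the rules live in one declarative table, the same checks can run at every hop of the data journey rather than once, after the damage is done.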
Like Tibco’s Witzel, Peinoit believes better collaboration is needed within organisations to democratise data, but too many businesses are not getting data to those who need it most. This lack of collaboration between the lines of business and IT, and between the lines of business themselves, means teams not only lack the right data, but also don’t understand it and don’t always trust it.
According to a Talend survey released in 2021, only 40% of executives say they always trust the data they work with, supporting Peinoit’s claim for organisations to build more collaborative data-quality cultures. But how is this done? Peinoit suggests “taking a holistic approach to data management in order to properly manage the entire data life cycle”, as well as driving greater data awareness and understanding by centralising the data office function, for example.
“Data quality is a team sport,” he says. “It’s impossible for a single person or team to manage an entire organisation’s data successfully.”
One answer, Peinoit suggests, is self-service, which he says is a good way to scale data quality standards.
“Self-service applications like data preparation and data stewardship can allow anyone to access a dataset and cleanse, standardise, transform or enrich the data,” he explains. “Putting in place data-quality rules in order to bring business context in the detection and in the resolution of issues, or an overall data governance approach linked to metadata management, can also solve this issue.
“Intelligent capabilities or workflows using machine-learning technologies, for example, can help apply quality controls automatically – masking, access management, and so on – and scale data quality throughout the development of data-driven initiatives.”
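The masking control Peinoit mentions can be illustrated with a simple deterministic pseudonymisation step. This is not Talend’s implementation, just the general technique: sensitive fields are replaced with stable tokens, so masked datasets can still be joined and deduplicated without exposing the underlying values:

```python
import hashlib

SENSITIVE = {"email", "phone"}  # illustrative list of fields to mask

def mask_value(value: str) -> str:
    """Replace a value with a stable pseudonym: the same input always
    yields the same token, preserving joins across masked datasets."""
    digest = hashlib.sha256(value.encode()).hexdigest()[:10]
    return f"masked_{digest}"

def mask_record(record):
    """Mask only the sensitive, non-empty fields; leave the rest intact."""
    return {
        k: mask_value(v) if k in SENSITIVE and v else v
        for k, v in record.items()
    }
```

In production such a step would typically also involve a salt or key held separately, since a plain hash of a guessable value (a phone number, say) can be reversed by brute force.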
Another answer is automation. Machine learning will inevitably play a significant role as data volumes increase, but it will only work if datasets are not disjointed. Without a systems overhaul, it is almost inevitable that manual data input and quality checking will remain.
“The question of manual jobs versus automated ones is important,” says Quantexa’s Marria. “What is the best augmentation mix to achieve the best results?”
It’s a good question. There is no one-size-fits-all solution, but there needs to be an attitude shift, a desire to recognise what is going wrong and how to build for a data-driven future.
Marria adds: “For a large organisation with vast volumes of data, manual work is extremely demanding and often leads to inaccuracy and incompleteness in their datasets. But by automating elements of those resource-heavy tasks – such as transaction monitoring, claims investigations and know-your-customer processes – artificial intelligence [AI] and machine learning are able to extrapolate insights and create a single 360-degree view of the customer data.”
Certainly, McKinsey sees AI as having an increasingly pivotal role in ensuring good data quality. In a report last December, 40% of businesses said they are using AI for this purpose, with 45% using AI for data governance. That is to be expected, but AI is not a silver bullet for data-quality issues, let alone for decision-making.
It will take much more than that – for example, leadership, culture, collaboration, data literacy and experience – to ensure the data foundations are laid correctly, and consistently, and that decisions are made with more than one finger in the air.