Some headaches just never let up. When SearchCIO.com asked two business intelligence experts to talk about new BI challenges in these days of big data and cloud computing, two familiar albatrosses -- data quality issues and cost -- hovered at the top of the list.
"Regardless of big data, old data, new data, little data, probably the biggest challenge in BI is the data quality," said Bill Hostmann, a distinguished analyst at Gartner Inc.
BI veteran Boris Evelson considered the question from the vantage point of the decades he's spent in the field. "Data quality today is just as bad as 30 years ago," said Evelson, principal analyst at Forrester Research Inc. The barrier to fixing the problem of data quality is simple economics: "The hurdle is always expense," he said.
CIOs today might have a better understanding of data quality issues, and certainly there are more tools to help with data quality. Some argue that cloud-based BI products could help with cost, but our experts question the cloud's relevance to data quality. The never-ending challenge in data quality is that information is always changing: New systems come online, and new sources of data are tapped for BI. With big data, that can mean session logs, data from sensors, clickstream data and, as Hostmann put it, the "digital exhaust of social media."
In addition, data is not the only thing that's changing. As BI shifts from an effort controlled by the IT department to an activity practiced by users at all levels of the enterprise, the definition of data quality is also a moving target. For CIOs, the question should be this: What quality of data is good enough for the task at hand?
"It doesn't matter what they, as the providers of information, think data quality is," Hostmann said. "What matters is the level of satisfaction of the person who is using that information to analyze or make decisions on the data. Their perspective is what counts."
What data quality means depends on who is asking the question, when they need an answer and how much they are willing to pay to get it. What people are realizing more and more is that a "single version of the truth" is an impossible dream, Evelson said. "It is totally relative and it is totally contextual."
Data quality satisfaction, surveyed and monitored quarterly
So, if data quality is relative and getting data to a point where it is acceptable for use remains a BI imperative, what should be the CIO's plan of attack?
Hostmann advises his clients to establish data quality metrics by surveying their enterprises' "thinkers and deciders" on data quality issues related to their BI programs. Gartner uses a simple survey tool that measures users' level of satisfaction with their BI data on several fronts (timeliness, relevancy, accuracy and consistency) as well as their ability to use the data to make business decisions. The results should be tracked quarterly because the definition of data quality, again, is always changing.
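As a rough illustration of what tracking such survey results quarter by quarter could look like, here is a minimal Python sketch. It is not Gartner's tool, which the article does not describe in detail. The 1-to-5 scoring scale, the function names `quarterly_scores` and `quarter_over_quarter`, and the sample responses are all assumptions made up for this example; only the four dimension names come from the survey description above.

```python
from collections import defaultdict
from statistics import mean

# The four dimensions named in the article. The 1-5 scale used below is an assumption.
DIMENSIONS = ("timeliness", "relevancy", "accuracy", "consistency")


def quarterly_scores(responses):
    """Average each dimension's satisfaction score per quarter.

    `responses` is an iterable of dicts, each with a "quarter" key
    and one numeric score per dimension (hypothetical record layout).
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for response in responses:
        for dim in DIMENSIONS:
            buckets[response["quarter"]][dim].append(response[dim])
    # Sort by quarter label so consecutive entries are consecutive quarters.
    return {
        quarter: {dim: mean(by_dim[dim]) for dim in DIMENSIONS}
        for quarter, by_dim in sorted(buckets.items())
    }


def quarter_over_quarter(scores):
    """Change in each dimension's average between consecutive quarters."""
    quarters = list(scores)
    return {
        (prev, cur): {dim: scores[cur][dim] - scores[prev][dim] for dim in DIMENSIONS}
        for prev, cur in zip(quarters, quarters[1:])
    }


# Made-up responses for illustration only; not real survey data.
responses = [
    {"quarter": "Q1", "timeliness": 4, "relevancy": 3, "accuracy": 2, "consistency": 3},
    {"quarter": "Q1", "timeliness": 2, "relevancy": 3, "accuracy": 4, "consistency": 3},
    {"quarter": "Q2", "timeliness": 4, "relevancy": 4, "accuracy": 3, "consistency": 2},
]

scores = quarterly_scores(responses)
trend = quarter_over_quarter(scores)
```

A falling value in `trend` for a given dimension would flag a spot where user satisfaction is slipping and may deserve a closer look. In practice the responses would also be broken out by business unit, which this sketch leaves out.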
Getting the right business sponsor
The results of such a survey almost certainly will vary from business unit to business unit and turn up hot spots all over the enterprise. Here the question is this: Which data quality problems have the greatest effect on business strategy and objectives? "Now you are back to economics," Hostmann said. Deciding what to fix first, however, is not a job for the technologist. "That is something a business sponsor has to give you guidance on, another big challenge."
"IT does not own the data," Evelson said. Unlike almost any other enterprise application, BI by definition requires business ownership. The IT group can build a system to sort out how a company defines a customer as complex as IBM, for example. But as to whether it's worth getting to the bottom of that relationship -- "Well, it is certainly not IT's job," he said. "That is a board-level decision."
'Precision' BI users and 'clueless' BI users require different tools
Broadly speaking, CIOs who survey their enterprises about data quality satisfaction will find themselves dealing with two types of BI consumer: Hostmann calls them "precision users" and "the clueless" (those who "don't know what they don't know"). Data discovery tools are a good and relatively inexpensive way to give clueless users a view of potential data relationships. "You can bring tons of information into memory spaces without building multidimensional databases, and let people explore," he said.
Precision users include people in finance departments or in industries that are heavily regulated. They know what they are looking for and can identify data quality problems. Fixing those problems is a negotiation, however. Reconciling multiple definitions of the term "customer," for example, might require reengineering the system of record -- and that again raises the issue of cost. "They have a valid point, but how much are they willing to spend to fix that?" Hostmann asked.
Even when the cost is justified, don't expect data quality issues to evaporate, Evelson warns. CIOs can be sure that CFOs, chief marketing officers and vice presidents of sales will have very different definitions of customer profitability, depending on their compensation packages. "And IT gets caught in the middle," he said. To repeat: Some headaches just never let up.
Let us know what you think about the story; email Linda Tucci, Senior News Writer.