Benchmarking real data governance

Data governance is hotting up. But what is the reality? Andy Hayler discusses the findings from a benchmarking survey from his firm and the Data Governance Institute.

Data governance is an area that has shown a dramatic upsurge in activity over the last year or so. One conference that I spoke at in the United States dedicated to data governance almost tripled its attendees in 2010 over 2009.

It is increasingly dawning on companies that an effective data governance programme is key to the success of master data management (MDM) projects, which are fast emerging from the territory of pioneers into the mainstream. Master data management is about the management of shared data in an organisation (such as customer, product, asset, location and so on), so any MDM programme will quickly hit the problem of resolving competing definitions of data as used by different parts of the organisation.

IT departments do not typically have the authority to get business departments to change their data or processes. Everyone supports the idea of standardisation, but only if it means other people standardising on their existing definitions and practice, not if it involves them changing what they are doing. Consequently, conflict about who owns “customer” and “location” is almost inevitable, so an effective process needs to be put in place to resolve such conflicts.

Data governance is data light

Much has already been written about data governance, but given the relative immaturity of this area it is not surprising that what is published is heavy on “expert” opinion but light on actual data. In reality, what do data governance programmes look like, what do they cost and how effective are they? It is a lot tougher to get answers to these questions than it is to get a smooth-talking consultant with a PowerPoint slide deck to tell you how it should be done!

In order to get a better grip on this area my company teamed up with the Data Governance Institute, a US organisation whose members include arguably the leading guru in this area, Gwen Thomas. We devised a detailed framework for data governance that would be suitable for measurement of progress, and validated this with a panel of multi-national companies who were well recognised for their existing and extensive data governance programmes. After much tweaking, the framework emerged and a detailed survey was built up around it, suitable only for companies that had an existing operational data governance programme.

According to our beta customers, it was a lot of work to gather the data needed to answer the survey fully (up to a couple of days of effort), so we expected perhaps a couple of dozen companies to participate, the idea being that they would be able to compare their own programmes with those of their peers.

In fact we had a surprisingly high participation in the benchmark exercise, with 134 organisations from a wide range of industries submitting benchmark data. This large sample size meant that we were able to conduct some proper statistical analysis on the characteristics of the behaviours of organisations that had (in their own opinion) successful programmes compared to those that had less successful ones.

The length of the survey meant that there was a wealth of data available for analysis, and some interesting conclusions have emerged.

Overall, 57% of the benchmark dataset consider their data governance programmes to have been at least partly successful, which is not bad for such a new area. The general structure is to have a core team of people to run the programme (mean size of two people, median of four) with a mean of nine (median four) part-time data stewards.

The cost of setting up a data governance programme clearly varies dramatically based on organisation size, but the mean cost of doing this is $3.5 million (median cost of $250k, reflecting the wide variation in size of the respondents) with ongoing annual costs of $1.2 million (median $200k).
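A gap this wide between a mean and a median is what a heavily skewed set of respondents produces: a handful of very large programmes pull the mean up, while the median stays close to what a typical organisation spends. The short sketch below illustrates the effect. The figures in it are hypothetical and chosen purely for illustration; they are not taken from the survey data.

```python
from statistics import mean, median

# Hypothetical set-up costs in US dollars (illustrative only, NOT survey data).
# Most programmes are modest; two very large ones dominate the total.
setup_costs = [
    100_000, 150_000, 200_000, 250_000, 250_000,
    300_000, 400_000, 10_000_000, 20_000_000,
]

mean_cost = mean(setup_costs)      # pulled upwards by the two large programmes
median_cost = median(setup_costs)  # the middle value, unaffected by the extremes

print(f"mean:   ${mean_cost:,.0f}")
print(f"median: ${median_cost:,.0f}")
```

When benchmarking against peers, the median is usually the more useful comparison for a typical organisation, while the mean says more about the total spend across the sample.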

Only 11% of companies had tried to measure the monetary benefits of their programmes, which is disappointing, but then only 54% make any attempt to measure the cost of poor quality data. A scary 58% of respondents admit to not having an effective register of business risks, including 68% in the banking sector, which is troubling given the recent issues in this sector. Moreover only 20% of respondents are confident about who can update their critical business data.

One key message is that it is vital to develop clear and documented processes to resolve data disputes. Only 16% had a fully effective process, and this factor had a high and statistically significant correlation with the effectiveness (or otherwise) of the overall programme. Indeed, those who had effective programmes could be characterised as having an active business risk register, having good logical models for key data domains, undertaking data quality assessment on a regular basis, having developed a business case for the data governance programme and having significant training activity associated with it, amongst other things. It is worth noting that there is little technology so far to support data governance: 72% of the benchmark database had deployed no specific technology to help them, and 6% had developed their own tools.

An article of this length cannot do full justice to the findings of the benchmarking survey (the full report runs to over 60 pages), but at least there is now a reasonably complete database of data governance activity out there, and one which will hopefully grow significantly over time as more organisations get their data governance programmes operational and add their experiences to the database. Organisations can use the detailed benchmark against their peers in order to evaluate and justify their own data governance programmes, allowing them to see in detail how well their programmes are doing compared to a reference set of others.

In my experience a good data governance programme is almost a pre-requisite for an effective master data management project, which given the amounts being expended now on the latter means that data governance will soon be coming of age.

Andy Hayler is one of the world’s foremost experts on master data management. Andy started his career with Esso as a database administrator and, among other things, invented a “decompiler” for ADF, enabling a dramatic improvement in support efforts in this area. He became the youngest ever IT manager for Esso Exploration before moving to Shell. At Shell, he set up a global information management consultancy business which he grew from scratch to 300 staff. Andy was architect of a global master data and data warehouse project for Shell downstream which attained $140m of annual business benefits.

Andy founded Kalido, which under his leadership was the fastest growing business intelligence vendor in the world in 2001. Andy was the only European named in Red Herring’s “Top 10 Innovators of 2002”. Kalido was a pioneer in modern data warehousing and master data management.

He is now co-founder and CEO of The Information Difference, a boutique analyst and market research firm, advising corporations, venture capital firms and software companies. He is a regular keynote speaker at international conferences on master data management, data governance and data quality.

Andy has a BSc in Mathematics from Nottingham University. He is also a respected restaurant critic and author. Andy has an award-winning blog. He can be contacted at: [email protected]
