What do businesses really look for in open data?

This is a guest blog by Harvey Lewis, Deloitte


“The value of an idea lies in the using of it.” Thomas A. Edison, American Inventor.


In 2015, the UK’s primary open data portal, www.data.gov.uk, will be six years old. The portal hosts approximately 20,000 official data sets from central government departments and their agencies, local authorities and other public sector bodies across the country. Just over half of these data sets are available as open data under the Open Government Licence (OGL). Data.gov.uk forms part of an international network of over three hundred open data efforts that have seen not just thousands but millions of data sets worldwide becoming freely available for personal or commercial use. [See http://datacatalogs.org and www.quandle.com].

Reading the latest studies that highlight the global economic potential of open data, such as that sponsored by the Omidyar Network, you get a sense that a critical mass has finally been achieved and the use of open data is set for explosive growth.

The data sets on data.gov.uk include the traditional ‘workhorses’, like census data, published by the Office for National Statistics, which provides essential demographic information to policy makers, planners and businesses. There are also many examples of more obscure data sets, such as one covering the exposure of burrowing mammals to Radon Rn-222 in Northwest England, published by the Centre for Ecology and Hydrology.

Although I’m not ruling out the possibility there may yet be a business in treating rabbits affected by radiation poisoning, simply publishing open data does not guarantee that a business will use it. This is particularly true in large organisations that struggle to maximise use of their own data, let alone be aware of the Government’s broader open data agenda. The Government’s efforts to stimulate greater business use of open data can actually be damaged by a well-intentioned but poorly targeted approach to opening up public sector information – an approach that may also leave more difficult-to-publish but still commercially and economically important data sets closed.

But is business use predicated on whether these data sets are open or not? And what is the impact on economic success?

Businesses would obviously prefer external data to be published under a genuinely open licence, such as the OGL. Under such a licence, the data is free for commercial use with no restrictions other than a requirement to share alike or to attribute the data to the publisher. However, if businesses are building new products or services, or relying on the data to inform their strategy, a number of characteristics other than just openness become critical in determining success:

· Provenance – what is the source of the data and how was it collected? Is it authoritative?

· Completeness and accuracy – are the examples and features of the data present and correct, and, if not, is the quality understood and documented?

· Consistency – is the data published in a consistent, easy-to-access format and are any changes documented?

· Timeliness – is the data available when it is needed for the time periods needed?

· Richness – does the data contain a level of detail sufficient to answer our questions?

· Guarantees of availability – will the data continue to be made available in the future?
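
To make a few of these characteristics concrete, here is a minimal Python sketch of how a business might screen releases of a data set for completeness, consistency and timeliness. Everything in it is an assumption made for illustration: the DatasetRelease record, the three function names, the 31-day threshold and the sample spending rows are invented, and none of it is drawn from data.gov.uk or from any certification scheme.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetRelease:
    """Hypothetical metadata record for one published release of a data set."""
    publisher: str       # provenance: who published the release
    licence: str         # e.g. "OGL"
    published_on: date   # when the release appeared
    columns: tuple       # column names, in published order
    rows: list           # each row is a dict keyed by column name


def completeness(release):
    """Fraction of cells that are present (neither None nor an empty string)."""
    total = len(release.rows) * len(release.columns)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for row in release.rows
        for col in release.columns
        if row.get(col) not in (None, "")
    )
    return filled / total


def is_consistent(releases):
    """True if every release uses exactly the same columns in the same order."""
    return len({r.columns for r in releases}) <= 1


def is_timely(release, today, max_age_days):
    """True if the release is no older than max_age_days on the given date."""
    return (today - release.published_on).days <= max_age_days


# Invented sample data: two monthly spending releases from one publisher.
jan = DatasetRelease(
    "Dept A", "OGL", date(2014, 1, 31), ("supplier", "amount"),
    [{"supplier": "Acme", "amount": 100}, {"supplier": "", "amount": 250}],
)
feb = DatasetRelease(
    "Dept A", "OGL", date(2014, 2, 28), ("Supplier", "Amount"),
    [{"Supplier": "Acme", "Amount": 120}],
)

print(completeness(jan))                      # 0.75 (one supplier is blank)
print(is_consistent([jan, feb]))              # False (column names changed case)
print(is_timely(feb, date(2014, 3, 15), 31))  # True (15 days old)
```

Checks like these cover only part of the list. Provenance, richness and guarantees of future availability depend on what the publisher documents and commits to, so they cannot be inferred from the data alone.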

If these characteristics cannot be guaranteed in open data, or are available only under a commercial licence, then many businesses would prefer to pay to get them. While some public sector bodies – particularly the Trading Funds – have, over the years, established strong connections with business users of their data and understand their needs implicitly, the Open Data Institute is the first to cement these characteristics into a formal certification scheme for publishers of open data.

A campaign is needed to get publishers to adopt these certificates and to recognise that, economically at least, they are as important as Sir Tim Berners-Lee’s five-star scale for linked open data.  For example, although spending data may achieve a three- or even a four-star rating in the UK, not all central government departments publish in a timely manner, in a consistent format or at the same level of richness, and some local authority spending data is missing completely. These kinds of deficiencies, which are shared by many other open data sets, are inhibiting innovation and business take-up, yet are not necessarily penalised by the current set of performance indicators used to measure success.  

It’s time for open data to step up. If it is to be taken seriously by businesses then the same standards they expect to see in commercially licensed data need to be exhibited in open data – and especially in the data sets that form part of the ‘core reference layer’ used to connect different data sets together.

Publishing is just the first and, arguably, the easiest step in the process. The public sector’s challenge is to engage with businesses to improve awareness of open data, to understand business needs and harness every company’s constructive comments to improve the data iteratively. We may have proven that sunlight is the best disinfectant for public sector information, but understanding and working with business users of open data is the best way of producing a pure and usable source in the first place. 


Harvey Lewis is the research director for Deloitte Analytics and a member of the Public Sector Transparency Board.