Live, open government data – collated on third-party platforms – can change people’s lives profoundly in a wide variety of ways.
According to the UK’s Data.gov.uk website: "The government has made releasing open data a priority because: it makes the government more accountable to citizens and strengthens our democracy (for example, the Department for International Development’s Aid Tracker); it brings us better public services (for example, The Guardian’s GCSE Schools Guide); and it feeds economic and social growth (for example, transport data intermediary Placr)."
The government says it leads the world in open data – but just what does this really mean? As it backtracks to explain its recent move to gather NHS patient data from primary care sources, it is a timely question.
The government generally makes data available in an extracted spreadsheet format. This raw data can be used by anyone – so long as they have the skills to operate Microsoft Excel (or another spreadsheet with built-in analytics), or have the luck and skill to be able to use a business intelligence tool.
However, spreadsheet data is not live data – it will always be a snapshot in time, taken when the data was extracted from the systems that created it. It will also be a single set of data. Downloading multiple sets of data from various sources – and cleansing and matching the data to give a more homogeneous and meaningful dataset to work against – is well beyond the capabilities of all but a privileged few data scientists.
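To give a sense of the cleansing and matching work involved, here is a minimal Python sketch that joins two downloaded extracts on a school name. The column names, the sample rows and the `normalise` helper are all hypothetical, invented for illustration; real government extracts would need considerably more care than this.

```python
import csv
import io

# Two hypothetical spreadsheet extracts, as they might be downloaded as CSV.
# The column names and rows are invented for illustration.
SCHOOLS_CSV = """School Name,Pupils
 St Mary's School ,420
OAKFIELD ACADEMY,910
"""

RESULTS_CSV = """school,pass_rate
st mary's school,0.71
Oakfield Academy,0.64
Hilltop College,0.58
"""


def normalise(name):
    """Cleanse a key: trim whitespace, collapse runs of spaces, lowercase."""
    return " ".join(name.split()).lower()


def load(text, key_column):
    """Read CSV text into a dict keyed by the normalised key column."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        rows[normalise(row[key_column])] = row
    return rows


def match(schools, results):
    """Inner-join the two extracts on the cleansed school name."""
    merged = []
    for key, school in schools.items():
        if key in results:
            merged.append({
                "school": key,
                "pupils": int(school["Pupils"]),
                "pass_rate": float(results[key]["pass_rate"]),
            })
    return merged


merged = match(load(SCHOOLS_CSV, "School Name"), load(RESULTS_CSV, "school"))
```

Even in this toy case, the two sources disagree on capitalisation and spacing, and one school appears in only one extract and silently drops out of the result.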
Spreadsheet data has its place, however, and it is promising to see government making a lot of different datasets available for general use. The Guardian – the second of the government’s examples quoted in the introduction above – leads its field in opening public data for the general citizen. Its Open Platform provides apps and interfaces so data can be viewed in an easy and meaningful manner by anyone, and for which no data analytics skills are required.
What is really required is for live data to be opened to the public. This means being able to read (not write or edit, for obvious reasons) the data from the live sources, and for these sources to be linked as necessary so the average person can make sense of them.
The government has so far created applications that sit on top of the data, which give a glimpse of the information – but not in a particularly open manner. For example, the NHS’s Choose and Book application allows a patient to book the hospital they want for an operation, based on live data on when operating slots are available with suitable capabilities. This is, however, a closed application on which to identify and book a specific operation – you cannot run any analytics against the data.
The examples cited above are also relatively closed applications. Aid Tracker allows you to see where international aid is going, and you can drill down to see the projects the money is nominally spent on, but any further examination of the data is impossible.
However, Placr is one example of how government can move towards a true open data environment. Placr is a digital platform put together to provide a single UK source of transport information, and has created apps based on datasets from elsewhere. One such example is its UK Traveloptions, for which Placr harnesses Transport for London’s (TfL) live data to show how bus and underground train services are running at any point, as well as BusMapper London, which allows for live interactive travel planning. Placr is a data intermediary – it acts as the means of providing access to live data in a procedural and relatively simple manner. And how does it do it? In the age-old manner of using an application programming interface (API).
An API gives anyone outside (or inside) the civil service a means of accessing data in a secure and efficient manner. The datasets behind the API can change their schema; the application logic behind the owning application can change. The API should ensure that backward and forward compatibility is maintained in how outside systems or applications talk to the system supplying the data.
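The compatibility point can be sketched in a few lines of Python. In this illustrative example, the owning system has changed its internal schema between two versions, but a translation layer keeps the public response in one stable shape. The record layouts, the field names and the `to_public` function are assumptions made up for the sketch, not taken from any real transport or government API.

```python
# Hypothetical internal records: the owning system renamed and restructured
# its fields between schema version 1 and schema version 2.
RECORD_V1 = {"schema": 1, "stop_name": "Aldwych", "mins": 4}
RECORD_V2 = {"schema": 2, "stop": {"name": "Aldwych"}, "due_in_seconds": 240}


def to_public(record):
    """Translate any known internal schema into one stable public shape."""
    if record["schema"] == 1:
        return {
            "stop": record["stop_name"],
            "minutes_until_arrival": record["mins"],
        }
    if record["schema"] == 2:
        return {
            "stop": record["stop"]["name"],
            "minutes_until_arrival": record["due_in_seconds"] // 60,
        }
    raise ValueError(f"unknown internal schema: {record['schema']}")


# Outside callers see the same response whichever schema sits behind the API.
public_v1 = to_public(RECORD_V1)
public_v2 = to_public(RECORD_V2)
```

An app written against the public shape keeps working when the internal schema moves from version 1 to version 2, because only the translation layer has to change.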
The Guardian is making its APIs available to others, and Apigee, Mashery, WSO2 and IBM are among the API management system suppliers that make multiple applications and data systems work together more easily. By using APIs effectively, it is possible to use the equivalent of an enterprise service bus across various systems, so as to aggregate data sources for different purposes. Does this mean that the government will have created a completely open data environment? It is unlikely. Those who select the data for publication will still choose what is made available, but it will have created a means for third parties to put together a greater number of more intelligent apps.
For example, let us take the NHS Choose and Book platform. Through the use of open APIs across multiple different systems, the patient would not only be able to find which hospitals have available theatre times for their operation, but also see whether public transport could get them there and back easily and on time, and whether local amenities were open and easy to get to while they are recuperating.
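As a rough sketch of how such a composite app might combine its sources, the Python below filters hospitals with free theatre slots by journey time and nearby amenities. The three data structures stand in for responses from three separate open APIs; the hospital names, figures and the `suitable_hospitals` function are hypothetical, and no real Choose and Book or transport interface is being described.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    hospital: str
    date: str  # ISO date of the free theatre slot


# Stand-ins for responses from three separate, hypothetical open APIs.
THEATRE_SLOTS = [
    Slot("Northside General", "2014-03-10"),
    Slot("Riverside Infirmary", "2014-03-12"),
    Slot("Hilltop Hospital", "2014-03-11"),
]
JOURNEY_MINUTES = {  # public transport journey time from the patient's home
    "Northside General": 35,
    "Riverside Infirmary": 95,
    "Hilltop Hospital": 50,
}
PHARMACY_NEARBY = {  # whether a local amenity is open and within reach
    "Northside General": True,
    "Riverside Infirmary": True,
    "Hilltop Hospital": False,
}


def suitable_hospitals(slots, journeys, amenities, max_journey=60):
    """Keep slots that are reachable in time and have amenities nearby."""
    matches = []
    for slot in slots:
        minutes = journeys.get(slot.hospital)
        if minutes is None or minutes > max_journey:
            continue  # no transport data, or the journey is too long
        if not amenities.get(slot.hospital, False):
            continue  # no suitable amenity for the recuperation period
        matches.append((slot.hospital, slot.date, minutes))
    # Shortest journey first.
    return sorted(matches, key=lambda match: match[2])


options = suitable_hospitals(THEATRE_SLOTS, JOURNEY_MINUTES, PHARMACY_NEARBY)
```

None of the three sources needs to know about the others; the app does the joining, which is only possible if each source exposes its data through an open API.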
UK flood response mechanism
Another example would be how to respond to the recent UK floods. Research reported in the journal Nature Climate Change predicts that the spate of floods in southern England and elsewhere is set to repeat and worsen to the extent that, by 2050, the economic cost of flooding in the European Union will rise from its present £4bn a year to almost £20bn.
But the use of open APIs could create an app that matches real-time flood data with where emergency services are and what their availability is; and the availability of supplies such as sandbags and sand. This can be reported along with the capability to log damage data and photographs with insurance companies in a manner that meets the insurance company’s criteria – without the need to wait until the insurer has decreed that the floods have retreated sufficiently to send out an assessor.
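The matching step at the heart of such an app could look something like the following Python sketch, which gives each flood alert, most severe first, the nearest available crew. The alert and crew records, the flat grid coordinates and the `assign_crews` function are all hypothetical simplifications; a real service would use live feeds and proper geographic distances.

```python
import math

# Hypothetical snapshots of two live feeds, positioned on a simple flat grid.
FLOOD_ALERTS = [
    {"area": "Riverbank", "x": 0.0, "y": 0.0, "severity": 3},
    {"area": "Low Meadow", "x": 10.0, "y": 0.0, "severity": 1},
]
CREWS = [
    {"id": "crew-a", "x": 1.0, "y": 1.0, "available": True},
    {"id": "crew-b", "x": 9.0, "y": 0.0, "available": False},
    {"id": "crew-c", "x": 6.0, "y": 0.0, "available": True},
]


def assign_crews(alerts, crews):
    """Greedily give each alert, most severe first, the nearest free crew."""
    free = [crew for crew in crews if crew["available"]]
    plan = {}
    for alert in sorted(alerts, key=lambda a: a["severity"], reverse=True):
        if not free:
            plan[alert["area"]] = None  # no crew left for this area
            continue
        nearest = min(
            free,
            key=lambda crew: math.hypot(
                crew["x"] - alert["x"], crew["y"] - alert["y"]
            ),
        )
        free.remove(nearest)
        plan[alert["area"]] = nearest["id"]
    return plan


plan = assign_crews(FLOOD_ALERTS, CREWS)
```

Here the closest crew to Low Meadow is unavailable, so the area is given the next nearest one instead; the same pattern could be extended to stocks of sandbags and sand.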
As the internet of things (IoT) becomes more of a reality, the use of connected sensors in areas such as flooding, storm and other weather events means government will have to deal with increasing types and volumes of data. However, the value of this data will only be maximised if it can be used in as many different ways as possible. The need for the government to mandate open APIs across every new application should be a high priority.
Private sector partnerships
What should be recognised here is that a government that wants to be seen as the leading proponent of open data needs a number of partners in the private sector, not only to create the apps, but also to offer their own datasets so the result covers end-to-end needs. Government also needs to push those organisations where it has the biggest touch points – for example, in transport and private health – to move towards open, standardised APIs as well. Even if private organisations charge a small fee to access data, the overall benefit could be huge.
As cloud computing takes hold, there will be a lot more callable functions to provide capabilities that can be pulled together as a composite application. This will introduce opportunities for open data apps to add value through availability and access to different datasets. With the government driving a cloud-first approach through its G-Cloud/CloudStore/Public Services Network platforms, it could become a major supplier of cloud-based data services to the private sector and to consumer app creators.
Overall, open data is not really about providing massive datasets to the general public. What it is really about is enabling government and third parties to create apps that can access different datasets across a broad number of different sources and make sense of them in a way that helps the end user. This can only be done effectively through the use of open APIs.