The power of systematically inaccurate information

We read much about the insecurity of government databases but little about the consequences of the inaccuracy of that which is secure. Few follow good practice in data validation. Those supplying data often have more interest in consistency than accuracy (lest change raise questions). Too many have a vested interest in systemic inaccuracy. 

Tom Steinberg’s work on the power of information contains valuable insights and I have just enjoyed visiting the Free our data feedback forum (I voted for Adrian Norman’s PSIKey and added a comment on the need for us to be able to validate and control the data on ourselves, with third party oversight when that control is over-ridden in the public interest).

A mash-up of crud should, however, be flushed away – not used as a basis for policy formation. 

We need to rediscover some old disciplines if we want the data used for policy formation, let alone that used to support service delivery, to be fit for purpose, as opposed to the consistent but misleading fictions that commonly populate public databases of personal data or organisational performance.

Some of the fictions are systemic: they help those with a vested interest in inaccuracy to claim benefits or subsidies (including by inflating local population estimates) or to meet targets (e.g. claims of service delivery). Others are random: e.g. statistical returns or personal data submitted by those with no interest in whether they are accurate or not.

Forty years ago I was taught that unless data is collected from those who have a vested interest in its accuracy it probably has a 30% error rate. I was also told that unless it is used and validated regularly its accuracy degrades by at least 10% per annum.
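
To put rough numbers on those two rules of thumb, here is a minimal sketch in Python. It rests on assumptions that are mine, not part of what I was taught: that a 30% error rate means 70% of records start out accurate, and that "degrades by 10% per annum" means each year a further tenth of the still-accurate records goes stale (a compounding reading; a straight-line reading would fall faster). The function name `projected_accuracy` is purely illustrative.

```python
def projected_accuracy(initial_accuracy: float, annual_decay: float, years: int) -> float:
    """Share of records still accurate after `years` without validation.

    Assumption: decay compounds, i.e. each year a fraction `annual_decay`
    of the records that are still accurate becomes inaccurate.
    """
    return initial_accuracy * (1.0 - annual_decay) ** years


# Assumed starting point: 30% error rate, so 70% of records are accurate.
for years in (0, 1, 3, 5):
    share = projected_accuracy(0.70, 0.10, years)
    print(f"after {years} year(s): {share:.1%} accurate")
```

On these assumptions, data that starts 70% accurate is only about 41% accurate after five years without validation, so most of the file is by then fiction.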

Recent studies of criminal and medical records have shown that error rates of 30 – 50% are still common. An exercise last year by a credit reference agency indicated that half those for whom they had records no longer lived at the address given on the electoral register and half those on their records were not on the register at the same address. No wonder they had lost interest in having access to the registers of a “banana democracy”. 

This is not a new problem. In 1973 – 4 I was project manager for the ICL-DTI-DoE computing strategy study to aid the formation of the Regional Water Authorities. This included looking at the billing and rating systems of almost every major utility and local authority. Hereditaments (alias properties to be served) rarely changed (save that houses turned into flats and vice-versa) but the churn of the occupants ranged from 2.5% p.a. in leafy suburbs to over 400% p.a. in city centres. The main difference today is that the people churn is even higher. Hence the growth of operations like Experian, with credit reference but one strand of secure and trusted information management – under governance routines that put others to shame.

Data on businesses also has to be published and maintained if it is to remain accurate. A recent exercise using the BERR business database showed it to be rather less accurate and comprehensive than Dun and Bradstreet.

Biometrics may be a mature technology but human biometrics are unstable when it comes to the risk of false negatives – from watering eyes affecting iris scans to “wear and tear” changing the fingerprints, especially of the elderly.

Meanwhile attempts to realise the “savings” from holding data once and once only can ignore the reasons it is not the same on different files. When trying to merge part number master files at my first employer I learned that supposedly identical components had different mean times between failure. Fudging over such differences has been the cause of some very embarrassing recent product recalls.

It is not only politicians who need to rediscover the disciplines of information governance.

The Directors Forum on Information Governance organised by EURIM on 24th November is therefore likely to be most interesting. We are currently seeking advance discussion papers that cover not just what needs to happen but how to bring about the changes of political and regulatory processes that are necessary to identify and encourage good practice, not just tick-box compliance. The pressures of organisational survival in a post-crash world will make this event even more timely than expected. Three days later, on the 27th, I am due to chair the morning session of “A fine balance”, a workshop on the potential of Privacy Enhancing Technologies organised by the Cybersecurity Knowledge Transfer Network. I will be asking the organiser to suggest that participants look at the advance papers on the EURIM website to help put their technologies into the application context of today – not that of a world of trust in technology solutions that has evaporated.