Improving data quality

For most of my career I’ve been concerned about the poor quality of our databases. It has been bad in most organisations I’ve encountered, and it is likely to get progressively worse with increasing centralisation and recycling of data. Joseph Juran, the famous quality expert, estimated the cost of poor-quality data at around 20-40% of sales turnover. In the public sector, the consequences of bad data can be highly damaging to the individuals affected. Standards would help, as would better discipline and tools at the point of capture.

Recently I was talking to George Barron and Ross Miller of Unified Software, who operate an Internet-based service, called BankVal, for verifying bank account information at the point of entry. They reckon that typically 8% or more of non-verified bank account details are incorrect, and that the average repair cost is £35 per record. The check itself costs pennies, so there are clear savings to be made from checks like this, not to mention the added security benefits.
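The economics here are easy to sketch. Using the article's figures, and assuming (hypothetically) that a check costs 5p per record, the expected saving per record checked at capture time works out as follows:

```python
# Back-of-envelope figures from the article; CHECK_COST is an assumption
# standing in for "pennies" - not BankVal's actual pricing.
ERROR_RATE = 0.08     # ~8% of non-verified bank details are incorrect
REPAIR_COST = 35.00   # average cost (GBP) to repair one bad record
CHECK_COST = 0.05     # assumed cost (GBP) of one verification check

def expected_saving_per_record(error_rate=ERROR_RATE,
                               repair_cost=REPAIR_COST,
                               check_cost=CHECK_COST):
    """Expected saving from verifying one record at the point of entry:
    avoided repair cost, less the cost of the check itself."""
    return error_rate * repair_cost - check_cost

print(f"£{expected_saving_per_record():.2f} saved per record checked")
```

Even with these rough numbers, each record checked saves an expected £2.75, so the check pays for itself many times over.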

George tells me that, in general, 96% accuracy is about as good as can be expected from data capture, but under certain conditions, because of operator tiredness or interruptions for example, accuracy can be far lower. A further problem is that banking data changes surprisingly quickly. The UK banking database, for example, comprises around 20,000 records, and in any given month around 300-400 of these change (a rate of 1.5-2% per month). An unmaintained banking database can therefore become seriously inaccurate within a short space of time.
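The rate of decay compounds quickly. A rough sketch, using the article's figures (taking 350 changes per month as the midpoint of the 300-400 range) and assuming changes hit records independently from month to month:

```python
# Sketch of how an unmaintained copy of the UK banking database decays.
# Assumes ~20,000 records with ~350 changing per month, changes striking
# records independently each month - an illustrative model, not a claim
# about the real database's behaviour.
RECORDS = 20_000
CHANGES_PER_MONTH = 350  # midpoint of the 300-400 quoted in the article

def fraction_still_accurate(months, records=RECORDS,
                            changes=CHANGES_PER_MONTH):
    """Fraction of records expected to be unchanged after `months`."""
    monthly_rate = changes / records   # ~1.75% per month
    return (1 - monthly_rate) ** months

for m in (6, 12, 24):
    print(f"after {m:2d} months: {fraction_still_accurate(m):.1%} accurate")
```

On this model a copy left unmaintained for a year is down to roughly 81% accuracy, and after two years barely two thirds of its records can be trusted.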

We need to give greater attention and priority to the problem of poor data quality. If the data management community cannot push the subject higher up the management agenda, then perhaps it’s time for security managers to add their weight to the issue.