Data ethics: Author warns of ethical pitfalls of data collection

Heedless data collection in business analytics programmes is breeding moral hazards, according to author Frank Buytendijk.

When venturing into unknown territory, there is a good chance you will do something wrong: immoral, unethical or just plain bad.

Frank Buytendijk, the author of a new book on business intelligence (BI) and analytics and chief marketing officer at business process software firm Be Informed, is out to instill fear in attendees of The Data Warehousing Institute's (TDWI's) first London Symposium, which takes place Sept. 10-12. "I want them to be scared, but in a good way," he said in a preconference interview.

While efficiency and effectiveness are usually at the heart of the business case for an analytics programme, ethics should be, too, Buytendijk argued. And just because an organisation can piece together a customer’s life from their data trail does not mean it always should.

Buytendijk, who is giving the keynote speech at the conference, will draw on his book, Socrates Reloaded: The Case for Ethics in Business & Technology, which offers 15 essays on how philosophers would view business IT -- "how Aristotle would have defined an enterprise architecture; how Plato would have viewed IT governance; how Harvard’s Michael Sandel would treat enterprise 2.0 initiatives"; and how Karl Marx "predicted the end of Google and Facebook."

How did he do that? On Buytendijk’s account, Google and Facebook are as hungry for data as Victorian capitalists were for capital. They want to collect as much data as they can for the benefit of advertisers, not for the producers of [that data]. People alienate their data in the way Marx says we alienate our labour power. Moreover, our love of Google and Facebook’s "free" services is the counterpart of the religious "opium of the people."

But -- and here comes the dialectic -- in the networked world we can "leverage the same community-based business model of the Internet giants to overthrow them." Go somewhere else and they crumble, just as cool hangouts become uncool when the hip people go elsewhere, he added.

Buytendijk said there are two schools of ethical thought to be aware of: the "consequentialist" school, which judges people and actions by their consequences; and the "universalist" school, which judges by intentions. Consequentialists tolerate white lies; universalists don't. IT people need to do a bit of both: be aware of the consequences of data collection, and think hard about its purposes up front.

He also aligns the "consequentialist" tradition of ethics with such business strategy-as-emergent thinkers as Henry Mintzberg, and the "universalist", intentionalist school of ethics with such strategy-as-planning proponents as Michael Porter. It's the "go with the flow" bottom-up approach to strategy contrasted with the five-year-plan, top-down approach. Again, you need both, according to Buytendijk.

And so he propounds a pragmatic, context-based synthesis that invokes rules, but is sensitive to the specific needs of individuals or companies. Progress in technology, he said, is seldom predetermined, as universalists might suppose: It emerges. A technology example is text messaging, which was originally designed for engineers to check lines, not for us to text each other.

But that does not mean we should laud the market as amoral, and be laissez-faire, he continued. In the realm of business analytics, "big data" approaches that say, "let's gather all we can, then see what we can do with it" are breeding moral hazards.

"The old data warehousing model set its own constraints," he said. "You build a model of the business, load it with data and derive BI reports." Those borders have disappeared, and privacy issues are abounding.

Buytendijk cited the Dutch TomTom example, where the satellite navigation company aggregated and sold customer data to local governments and infrastructure authorities. The Dutch police then obtained the data and used it to decide where to place speed traps. In April 2011 TomTom chief executive Harold Goddijn apologised for having sold the anonymous data, generated by customers who had subscribed to a live traffic feed, in the belief that it would be used to improve safety or relieve traffic bottlenecks.

"You need to compare the use of data with the original intention of measuring that data," concluded Buytendijk.

Another example of data getting out of hand, said Buytendijk, is public sector data integration projects that have led to victims of identity theft being arrested repeatedly. "This is costing people their businesses, marriages and lives," he said.

Finally, data mining technologies are apt to answer questions that were never asked, he warned.

"You cannot undo knowledge, and data discovery technologies can tell you stuff you just don't want to know" -- such as this, hypothetically: old people are more likely to commit fraud than others. Be afraid.
