CW500: Getting to grips with big data

Experts from Standard Life, Deloitte and Intelligent Business Strategies share some useful tips for CIOs on understanding big data

Organisations risk pushing ahead with big data projects without a clear understanding of what they will deliver, CIOs at Computer Weekly’s 500 Club heard.

Big data technologies promise businesses the ability to analyse hitherto unimaginable quantities of information, for nuggets that could give them a real competitive edge.

Devices are emerging that scour Twitter feeds to help companies assess what their customers think, analyse terabytes of phone records, or assess the data delivered by thousands of industrial sensors in an oil refinery.

Financial services companies are turning to big data to drive algorithmic trading and to help them meet “know-your-customer” regulations.

Energy companies are interested in big data technology for smart metering, and telecoms firms want to analyse their customers’ GPS and billing data.

The rate at which data is coming at us is beyond the ability of enterprises to consume

Mike Ferguson, Intelligent Business Strategies

“Even the sleepy old insurance industry down in Fenchurch Street has woken up to the fact that we can do something different here,” big data specialist Mike Ferguson, managing director of Intelligent Business Strategies, told the group.

Analysing masses of data

Yet many organisations are pushing ahead with big data projects without a clear idea of what they want to do with big data, the meeting heard.

Organisations have a tendency to invest in big data technology first, and work out what to do with it later, said Harvey Lewis, research director at Deloitte.

“A very large multinational organisation just kicked off a big data programme. When we asked why and what it hoped to achieve, it told us it was doing it because everyone was talking about big data, so it felt it should be doing something too,” he said.

The drivers for investing in big data are compelling. Companies realise that if they can make even a 1% or 3% improvement on margins from analysing the data they already collect, the payback will be huge, said Ferguson.

“People want to be able to look at everything – not just samples of data, but the entire data set; not just every transaction, but every single interaction,” he said.

One of the biggest challenges will be developing ways for organisations to analyse data generated outside the organisation, such as feeds from Twitter or web content.

Big data in numbers

1.8 zettabytes: volume of data created by humans in 2011, equivalent to a stack of books reaching from the Earth to Pluto 10 times

42 terabytes: volume of data generated by Standard Life in 2011

300 billion: number of e-mails sent worldwide each day

$5bn: size of the big data market in 2012 (expected to reach £15bn by 2015)

1 billion: number of items posted on Facebook in a day

1.7 million: number of tweets on Twitter each day

85%: proportion of the world’s data managed by organisations

70%: proportion of the world’s data created by people in their personal life

Sources: Deloitte, IDC, Standard Life, Wikibon

“The rate at which data is coming at us is beyond the ability of enterprises to consume. We have got to be able to put in filters to allow us to pull out quickly what is of business value,” said Ferguson.

Data quality will become increasingly important, as organisations seek to extract more value from their information, but there is still some way to go.

“If you look at the availability of software to address the data quality problem in the big data world, I think you would be sorely disappointed,” said Ferguson.

Big data has so far been the realm of small specialist suppliers. But mainstream suppliers, such as Oracle, IBM and EMC, are developing specialist appliances, based around the open source Hadoop platform, to help companies analyse and filter unstructured data.

“In my opinion, the enterprise data warehouse is dead,” said Ferguson. “We are now into multiple analytical stores, some of which are optimised for very specific problems, for database or social network analysis. Then we have the unstructured and semi-structured data being handled in a Hadoop environment.”

Building a business case for big data

Developing a business case for big data, however, remains a challenge, with many organisations wary, Hannelie Gilmour, group chief architect of Standard Life, told the meeting.

“A lot of companies wasted a lot of money in the data warehouse era. So there is a question around business confidence in spending a lot of money around data. It is difficult to really get to the value that big data is creating for an organisation,” she said.

Harvey Lewis, research director for Deloitte Analytics, advises businesses to think carefully about what they want to achieve before embarking on big data projects.

He warns against what he calls the reverse Hadron collider effect – throwing large volumes of data together in the hope they yield fundamental truth.

“We are mashing together bigger and bigger data sets in the hope that, through their collision, data scientists will be able to explain the virtual universe and everything in it. But this sort of experimental approach to big data strikes me as folly,” he said.

Too often, he added, organisations forget that big data analytics is a means to an end, not an end in itself.

It is as if John F Kennedy had appealed to the American people in 1961 with a project to build the world’s most powerful liquid-fuelled rocket, rather than to send people to the moon.

Lewis advises businesses to start small, by looking at, say, 2% of their data, before trying to analyse 100%.
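
As a rough illustration of what starting with a small slice might look like in practice, the sketch below draws a simple random sample of about 2% of a record set for a pilot analysis. The function name `sample_records`, the fixed seed and the stand-in data are assumptions made for this example; they are not drawn from Deloitte’s advice or from any particular product.

```python
import random


def sample_records(records, fraction=0.02, seed=0):
    """Return a simple random sample holding roughly `fraction` of `records`.

    Hypothetical helper for illustration only: a fixed seed keeps the
    pilot sample reproducible between runs.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not records:
        return []
    rng = random.Random(seed)
    # Always keep at least one record so a tiny data set still yields a sample.
    sample_size = max(1, round(len(records) * fraction))
    return rng.sample(records, sample_size)


# Stand-in for real customer records (an assumption for this example).
records = list(range(10_000))
pilot = sample_records(records)
print(len(pilot))  # 200
```

Sampling at random, rather than taking the first 2% of rows, reduces the chance that the pilot reflects only the oldest or most recent records.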

“Put bluntly, the message for CIOs is that big data does not automatically deliver big value. CIOs have got to get close to the business to understand the organisation's strategic objectives and to play a role in achieving them,” he said.

Sensitive data requires careful management

As big data becomes more prevalent, businesses will need to consider how they use data. According to IDC, 75% of the world’s data is created by and belongs to individuals, but 85% of it is managed by organisations.

This will place an increasing duty of care on organisations that retain and analyse data.

“The more we talk about data and the more citizens and customers see and feel the impact of big data on their lives, the more they will grow aware [of its use]. And once aware, they will have something to say about it,” said Lewis.

There is a real risk of a public backlash if data is lost or misused, or, as in the Tom Cruise film Minority Report, used in an intrusive way.

“If, for instance, by using more granular data about individuals, we could predict who would be most likely to commit a crime,” he said, “do we arrest them before they have committed a crime? What about when we get our predictions wrong?”

The suggestion is not as far-fetched as it sounds. The US Department of Health, for instance, is sponsoring a competition on the website, which invites entrants to develop models to predict which patients are most likely to be admitted to hospital in the next 12 months, using anonymised data.

The programme could have enormous implications for patients’ medical insurance premiums, according to Lewis.

“Big data is only going to get bigger and more complex. But we shouldn’t become fixated on the bigness, we should think carefully about the questions we want to ask big data and the direction we are heading for. And we should definitely think about the people represented by the data – for it is their digital lives we hold in our hands,” he said.

CASE STUDY: Standard Life – exploring the potential of big data

Standard Life, the Edinburgh-based life and savings company, is looking at ways to harness the vast quantities of data it collects each year.

Last year, the company’s data mountain reached 42 terabytes, 75% of it unstructured. The figure is doubling in size every three years, according to Hannelie Gilmour, group chief architect at Standard Life.
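
To put the doubling figure in perspective, the short sketch below projects the data volume forward, assuming the three-year doubling period simply continues unchanged. That constant rate is an assumption for illustration, not a Standard Life forecast, and the function name `projected_volume_tb` is hypothetical.

```python
def projected_volume_tb(years_ahead, current_tb=42.0, doubling_years=3.0):
    """Project data volume in terabytes, assuming steady exponential growth.

    Illustrative only: starts from the 42 TB reported in the article and
    assumes the volume keeps doubling every `doubling_years` years.
    """
    return current_tb * 2 ** (years_ahead / doubling_years)


for years in (3, 6, 9):
    print(f"{years} years: {projected_volume_tb(years):.0f} TB")
# 3 years: 84 TB
# 6 years: 168 TB
# 9 years: 336 TB
```

On that assumption, the 42 terabytes would grow eightfold within nine years, which is why storage cost features so prominently among the challenges Gilmour describes.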

The company is exploiting only a fraction of the data it collects. A survey last year showed that only 39% of the data stored over a six-month period had been accessed.

But new big data technologies will help Standard Life squeeze more value from the data it already collects.

“We are looking at areas such as true modelling of risk, so really trying to find the business and commercial value of risk,” she said, speaking to CIOs at a meeting of Computer Weekly’s 500 Club.

Gilmour is keen to use some of this data to find ways to encourage customers to stay with Standard Life.

Standard Life may only hear from customers once a year when they receive their statement, so finding a way to predict the factors that cause customers to change insurers is important.

The company is looking at sentiment analysis – analysing voice patterns when people call contact centres and what they say on Twitter – for feedback on the company’s performance.

Should we care about big data? If you don’t – your competition will

Hannelie Gilmour, Standard Life

Fraud detection is another possibility.

“We are looking at some of the capabilities. We can look across applications, data sets, and security systems and logs,” said Gilmour.

There are two technical challenges to overcome, said Gilmour: “On the one hand is low-cost storage – how do you collect, how do you store and how do you back up data and archiving strategies for this huge volume and variety and velocity of data? 

“On the other hand, how do you best use analytics, what data may be stored in the archive in the most efficient way, and how do we best make use of data within our organisation?”

She added that the company is barely scratching the surface of the data it is collecting.

Standard Life’s work is still at an exploratory stage. It will be some time before big data technology has established a stable track record.

“In our industry you have to have robust stable technologies. A lot of the technologies we are using have not been proven. The question is how you integrate that back into your organisation,” she said.

Asked whether organisations should care about big data, Gilmour warned: “If you don’t care, your competition will.”
