Four leading executives from US big data and analytics companies tell a similar story of imminent growth in big data analytics adoption in the UK and Europe. Senior executives from MapR, Cloudera, ParAccel and Pentaho traced out the likely pattern of growth in 2013 in a series of interviews with Computer Weekly.
Corporate users of IT have been getting their hands dirty with big data technologies, such as Hadoop, in 2012. Apache Hadoop is the open source instance of the parallel programming framework MapReduce, developed at Google. Hadoop simplifies data processing across huge data sets distributed across commodity hardware. It is one technology associated with big data, which includes social media data, machine-generated data and data types that do not fit neatly into the rows and columns of relational database technologies.
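As a rough illustration of the MapReduce pattern described above, here is a minimal single-process sketch in Python. It is not Hadoop code and uses no Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, chosen only for this example. A real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a small in-memory data set."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input: two tiny "documents".
counts = word_count(["big data", "Big analytics"])
```

Because each call to `map_phase` looks at only one record and each call to `reduce_phase` looks at only one key, the work can in principle be spread across many commodity servers, which is the property the article attributes to Hadoop.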
Stephen Brobst, Teradata's chief technology officer and a leading thinker in the data warehousing field, predicted at the beginning of this year that 2012 would be when big data would "cross the chasm". By that he meant it would still be in the post-innovator, early adopter phase, but moving from interactive digital companies like Google, Facebook, Twitter and LinkedIn through financial services into telecoms, and also moving from the US west coast to the east, and thence to the UK.
Asked how they see pilot projects developing, or failing to develop, and where they see big data in relation to "crossing the chasm", Jack Norris, vice-president of marketing at MapR; Kirk Dunn, chief operating officer at Cloudera; Rich Ghiossi, vice-president of marketing at ParAccel; and Quentin Gallivan, chairman and chief executive officer at Pentaho, gave broadly similar responses that differed in their details.
MapR, a customised Hadoop distribution company co-founded by Google alumnus MC Srivas and John Schroeder in 2009, launched its European operation on 6 December 2012, with its headquarters in London. Jack Norris, vice-president of marketing at the company, said customer and partner demand had driven the timing of the post-startup company's European launch.
He confirmed that, in his company's experience, UK organisations are going straight to production, whereas more experimentation has been evident among US prospects and customers.
He said he saw a range of emerging applications where Hadoop is being used creatively. "There are, firstly, the web 2.0 properties in digital. I liked this summer's story that Rubicon has now passed Google in the reach of their advertising network, based on the ComScore measurement. Rubicon happens to be a MapR customer, and so is ComScore, and Google is a partner," said Norris.
"The other end of the spectrum is a major US credit card issuer that rolled out a new service based on Hadoop in one quarter. We also have a UK financial services company doing something similar.
"But we've also seen Hadoop being used to understand sensor information on a global basis to schedule preventative maintenance. Examples are a semiconductor company and a server manufacturer. And it is being used in government intelligence, and at an internet security company. There, instead of doing sample data analysis for fraud detection, you are looking at all of the data, and looking for precursors to predict fraud: prevention rather than just detection."
Cloudera is another Silicon Valley Hadoop distributor, though it wraps its services around the open source version. It is also stepping up its European operation, confirmed chief operating officer Kirk Dunn, who said, with respect to the phasing of big data analytics, that "with a lot of technology the early phase is about evangelism". But with Hadoop for big data analytics, companies are "already feeling pain" in terms of data size and speed. He said the traditional adoption curve was less applicable: adoption was proving faster, and happening in parallel across sectors such as digital, financial services, telecoms and government rather than in sequence.
"Once you have this technology, you can see other possibilities and the open source nature of Hadoop is a wrinkle here. That means adoption is organic, bottom up as well as top down, from CIO level." Dunn recounted a story of how a "crafty" IT professional had taken advantage of a server refresh at a financial services company to download Cloudera's Hadoop distribution, unbeknown to the CIO, who had been cogitating about big data. Cloudera, he said, did some match-making, and introduced the two men to each other.
His advice for those beyond pilot-stage experimentation was plain. "Don't go trying to solve any new business problems. Look at your top two or three business imperatives and apply customer-generated big data capability to those. Trying to find some esoteric result from an esoteric technology is not to be recommended. Don't do that.
"The social networking companies have taught us that there is a level of intimacy we can get with who we are connected to. By the same token, enterprises can connect their products and services with customers in a more intimate way, which is like the social networking entities.
"Guy Chiarello, CIO at JPMorgan Chase, said the bank wants to understand customers so it gets more share of their wallets in a way which benefits those customers. It's the degree of customer insight that big data analytics makes possible that enables that. We are now able to do things that we were not previously able to do because of storage limitations and lack of compute power."
ParAccel, an advanced analytics database company based out of Santa Cruz, is also increasing its activity in the UK. Vice-president of marketing Rich Ghiossi stressed that big data is only part of its picture, which aligns more closely with what Gartner calls the logical data warehouse, spanning data stores large and small, non-relational and relational, from the locus of "an analytical hub". Other analysts and analyst houses have their own terms for the concept, such as the hybrid data ecosystem.
ParAccel was founded in 2005 by Barry Zane, one of the founders of data warehouse appliance vendor Netezza, which was sold to IBM in 2010.
In relation to big data analytics adoption, Ghiossi said: "The coasts of the US seem to be ahead of London, but only by a few months." There is a sticking point, however. "For you to adopt Hadoop today you really need a slew of expensive programmers. So, the government and the new dotcom, digital companies have those: other sectors, less so."
Like Norris, Ghiossi identified sensor data as an emerging area for user organisations, as well as social networking data. "Companies using control systems are leveraging log data in ways that are new. Small variations in the sensor data from air conditioning or electrical systems can indicate where preventative maintenance is needed. And we are also seeing, in California, where smart metering in energy can generate data that can change consumer behaviour in ways that could be phenomenal."
Pentaho's CEO Quentin Gallivan also said the big data phenomenon is fuelling growth for the open source data integration and business intelligence supplier, which was founded in 2004 and is based in Florida. It helps customers quickly ingest big data into their Hadoop, NoSQL or analytical platforms, such as Teradata's Aster Data, and enables them to visualise and analyse that data. Gallivan was the CEO of Aster Data when Teradata bought the company in 2011.
Customers tend to come to Pentaho once they have deployed Hadoop, he confirmed. "I do think the US is ahead, but we are seeing lots of use cases in the UK. And the industry analysts we've talked to in the UK are bullish. My sense is that if you are not an interactive company, in digital or government, you are more likely to be trialling [big data] technology, to be in an experimental phase."
A recent SearchDataManagementUK survey of 184 UK and continental European IT and business professionals critically engaged with data matters revealed that 27% planned to increase investment in big data technologies in 2013. And 23% of respondents already had big data programmes, at whatever stage, in development.
Norris, Dunn and Ghiossi were in London to attend a Big Data Analytics event, organised by Whitehall Media.