Cloudera users put pattern recognition at heart of new business ventures

Cloudera users Barclays, BT, Sky Betting & Gaming and Markerstudy are finding new business development opportunities through the use of the Hadoop ecosystem

A quartet of Cloudera customers have revealed some business projects, based on Hadoop-related technologies, on the eve of the London version of the Strata+Hadoop conference.

Data analytics professionals at Barclays, BT, Sky Betting & Gaming and insurance underwriting firm Markerstudy outlined a group of business development activities supported by Hadoop distributor Cloudera.

Harry Powell's eight-person advanced data analytics team at Barclays retail bank is using the programming language Scala, as well as Cloudera's query engine Impala and its distribution of the parallel processing framework Spark, on transactional big data sets to seed products for customers.

One example is a "smart business" application for small businesses that enables them to see patterns in anonymised data generated by the bank's other customers.

"So, if you are a hairdresser in Croydon, you can see what other hairdressers are spending on electricity," said Powell.

"We're looking to use information to provide services to customers as if they were hugely wealthy. We can create value for individuals by using big data to spot patterns. This marks us out as a bank that is for customers, rather than just about our profits.

"We're doing this over billions of rows of data and with potentially thousands of queries," he said.

Powell also reported better results with Spark compared with Hive. "We can now do computation we could never do before, and provide information at a level of granularity previously unimaginable."


Phillip Radley, chief data architect at BT, said the telecoms company is in its fifth year of Hadoop use. BT has moved the technology on from its beginnings in research and innovation labs to "being a first class citizen in the datacentre".

BT is using its Hadoop environment for "dozens of use cases" in its business. "We took a cautious approach to begin with, using it for ETL offload from data warehouses, and we're now moving on to support delivering services to customers, such as improving broadband by using data from electrical line testing."

Another area where the software is being used is in identifying nuisance calls, which Radley said is an example of making a product from data science research.

"Nuisance call analysis is something we've done in response to requests from Ofcom. We developed a model that can identify and route nuisance calls to a network-based answering machine."

Sky Betting & Gaming

Mark Pybus, engineering manager of big data at Sky Betting & Gaming, described how the data science and engineering teams have been using Spark and Scala, as well as older Hadoop stack tools, such as Hive, to identify customers who are -- or are about to become -- problem gamblers.

At 3pm on Saturdays, the company processes 200 bets per second, and it changes its prices continuously.

"We think we are at the forefront of using data to promote responsible gambling. We use machine learning against our Hadoop store to tell who is becoming an irresponsible gambler," said Pybus.

Sky Betting & Gaming will then customise offers to minimise the customer's vulnerability, and get a dedicated in-house responsible gambling team to contact them with advice and support.


Nick Turner, a consultant for motor insurance underwriter Markerstudy, said his firm has been using Hadoop for around a year to automate fraud detection. It receives 25 to 30 million quote requests each day, which all have to be "enriched with third-party credit scores, identity checking and fraud calculations, for example, and then priced".

Before the firm used Hadoop, the brokers relied on a sample representing 3% to 5% of the relevant data, said Turner. With the full data now available, the brokers can do finer analysis.

Similarly, the fraud teams can focus on higher-value cases, since potential fraud indicators are being identified at the quote stage.

All four users attested to putting the pattern recognition potential of the Hadoop technology stack to business use.
