Hadoop-centric data science proves value at Danske Bank

Nordics bank Danske Bank has been using the Hortonworks distribution of Hadoop as a plank in its data science programme, which targets fraud, fills ATMs, and aims to combat customer delinquency

Danske Bank has added a Hadoop cluster to its IT architecture, and found benefits in combating fraud and improving customer marketing.

Nadeem Gulzar, senior development manager at Danske Bank, said the organisation’s data science team has impressed the bank’s top leadership with the strategic value it delivers now, and could deliver in future years.

The bank, established in 1871 as the Danish Farmers’ Bank, is headquartered in Copenhagen, but operates across the Nordic region, as well as in the Baltics and Ireland.

Gulzar, who presented at the recent DataWorks Summit in Munich, said: “We started looking at Hadoop in 2014 with a small research team of three people, including me. With my team, we set the vision and got management’s support.

“In October 2015 we had a crucial workshop with the executive board – something which had never happened before in 150 years. We shared our vision, strategy, tactics and approach. On our final slide we outlined the cost. Thomas Borgen, our CEO, said: ‘Is it only that amount?’

“Of course, we’ve not had IT for 150 years, but we had it for 30 to 40 years, and the CEO said this was the first time he’d seen such optimism and commitment for the business from IT.”

The data science effort at the bank started to deliver value in 2016, he said. The team now stands at 60 and counting, but Gulzar added that there is much work still to be done. “Even though we have a strategy for the data lake, it’s not yet going as fast as we’d like,” he said.

The data lake includes, architecturally, an IBM mainframe and a Microsoft SQL Server data warehouse, as well as a Hadoop cluster from Hortonworks.

The Hadoop cluster has been put to use for fraud detection, using transactional data. “Here, we have reduced the amount of false positives by around 90%. From the customer’s perspective, it improves their experience with Danske Bank as they only get contacted by our investigation team, when a real fraudulent transaction occurs,” said Gulzar.

The system has also played a role in supporting the bank’s maintenance and operations department by using machine logs to predict when ATMs, which are equipped with sensors, need to be replenished with cash rather than doing so according to a schedule.

According to Gulzar, there has also been a marketing and sales benefit: “Thanks to the analytics, we have seen a 62% increase in customers responding to the first communication from our marketing team,” he said.

Another future application, using AI, could identify customers about to become “delinquent” because of a life-changing event such as a job loss or a divorce. Reaching such people before they churn from the bank means deploying a deep learning neural network, but Gulzar said this is still in its early stages.

Bringing business on board

The Hadoop-centric data science programme at the bank has not been plain sailing, said Gulzar. “It was a very tough job to convince all areas of the business that HDP [Hortonworks Data Platform] is of value to our business. It took pioneering efforts to outline how the customer experience would be improved.”

Finding and recruiting data science talent has also been difficult. Copenhagen University, the Technical University of Denmark, and ITU – the IT university in the capital – have been happy hunting grounds.

“But, of course, it is difficult. You could go with the approach of bringing in a senior, expensive data scientist with deep domain knowledge in banking – but that would be the equivalent of five of the people I have,” said Gulzar. 

“So, our approach has been to get people who understand the tooling, the methods, how to do machine and deep learning, and add the domain knowledge. A senior guy would leave you in a year or less.”

Danske Bank chose Hortonworks, said Gulzar, because of its proximity to the pure open-source Apache Hadoop family of software. The bank looked at alternative distributions, such as Cloudera or MapR, but “wanted a distribution that was in sync with what is happening in the open-source community, with no commercial components, but with support to help us build our technical infrastructure and business cases.”

The decision was made before the current COO and head of IT Jim Ditmore came on board in 2014.

“In the past three years, I have been asked by our COO multiple times: ‘Is this the right platform? Is Hortonworks the right partner? Can we believe in Hortonworks?’ He is sure it is, but for him, he’d like to jump into a world where this environment is fully enterprise ready, fully hardened,” said Gulzar.

“The fact is we need a partner who is able to follow the market as soon as new things happen, who can immediately adapt, so we don’t lose momentum,” he said.
