2014 has seen the characteristic themes of the big data movement move steadily into the mainstream: ways and means of dealing with unstructured data; how to operationalize big data analytics; how to build up a data science capability.
One of the stimulating conversations I've had about the big data phenomenon this year was with Andrew Jennings, FICO's chief analytics officer and head of FICO Labs.
He has held a number of leadership positions at the credit scoring outfit since joining the company in 1994, including a stint as director of FICO's European operations. Andrew was head of unsecured credit risk for Abbey National and has an academic hinterland as a lecturer in economics and econometrics at the University of Nottingham.
Did he think the big data cycle was winding down? He did. It has not become less relevant, he said, "but we are over the more outrageous things said in the last year or so. The Wild West of Pig, Hive and Hadoop has become more tamed; it's being made easier for people to use [big data technologies] in order to do analytics with data, and make decisions".
Dr Jennings was, in part, referring to his own company's analytic cloud service, which comprises open source tools, its own products, and other third-party elements, but also to efforts being made by the big suppliers, such as IBM, SAP and Oracle.
"Data-driven organisations do need more tools beyond the spreadsheet, so there is more tendency for big data technologies to be integrated".
Jennings sees the predictive analytics technologies developed over many years for financial services companies, by the likes of FICO or SAS, as having a broader applicability, and cites network security as an adjacent area.
"And in retail, the credit risk models developed over 25 years can be extended to the best action to take for an individual consumer", depending on their price sensitivity.
FICO is experienced in recruiting and developing the people it is now fashionable to call 'data scientists'. Does he think such people should get more focus than the development of data-savvy managers?
"Data scientists will get frustrated if management around them has no understanding of what they are doing. So, they need data-savvy managers, too".
On data scientists, as such, he said: "By 'data scientist' people mean something more than a statistician or a computer scientist who knows about database technologies, but someone with a broader set of skills: who can manipulate data, write code, hack something together, do the mathematical analysis but also understand the business context in which that is important.
"In my experience the really hard role to fill is that [data analytics] person who can also understand what the business goals are and can talk to the business. Can help us to sell, in FICO's case".
The rarity of such people means that building data science teams is the way forward, he concludes.
"It always comes down to: 'What's the decision I am trying to improve?' Or 'what operation am I trying to improve?'".
FICO's approach, on his account, is to make decisions easier and repeatable. "You've got to be able to deploy the model. We put our time not just into the algorithm, but into getting the model to the point at which a decision can be made and you can execute at the right speed".
As for big data technologies, he said: "I've been in analytics for years, and had never heard of Hadoop five years ago. It is now in our everyday language. All the big players - Oracle, SAP, and so on - are moving to make it less geeky. We're focused on the analytics and decisioning component of that".