Buy-or-build dilemma afflicts data scientist capability

‘Data scientist’ once barely registered as a search term but now returns close to a million web results. What is this job role, of which so much is said?

The term "data scientist" barely registered as a search term on Google in August 2010. It now returns close to one million results. But a search on the UK website of Harvey Nash, a professional recruitment company, for data scientist jobs returned, today, zero results.

What is this job role, of which so much is said, and of which there is so little trace in the real world? 

According to a definition on Computer Weekly’s sister site SearchBusinessAnalytics, “data scientist is a job title for an employee or business intelligence (BI) consultant who excels at analysing data, particularly large amounts of data, to help a business gain a competitive edge.”

But whether corporate organisations should recruit and develop their own data scientists is moot. The term has developed in tandem with "big data", and with a more general trend to squeeze commercial value from data. Its recent wildfire popularity as a "job that should exist" dates from around the beginning of 2012, according to consultants in the field.

Harvey Lewis, research director, analytics at Deloitte in London, says there has been an “increased level of attention among clients to data big and small in the past year. In January last year, there was a lot of noise in the market, but not much of a sense of what to do. ‘How can we put the customer at the centre of our business using data?’ is the key question," he says.

And there is, he continues, a consequent intensified focus on the role of the data scientist. He confirms his firm has been having discussions about that with UK organisations, helping them recruit people who combine a statistics background with an understanding of business context. 

“Creativity is required in the understanding of data as it relates to business," Harvey Lewis says.

Data science a voyage of discovery

Joe Peppard, chair in information systems at Cranfield School of Management, cautions that data analytics projects cannot be treated like conventional large IT projects because of the necessary element of serendipity. 

The value of data analytics won’t be obvious from conventional business case submission processes, he says. 

“As a starting point”, he advises, “establish a data lab. That can be a project bringing together a cross-functional team of open-minded people to explore data."

He cautions against a belief that external consultancies can supply data scientific analysis: “This is not a commodity. It is very important to understand context." 

The example he gives in a recent Harvard Business Review article, co-authored by Donald Marchand, from IMD business school in Lausanne, is of HM Revenue & Customs (HMRC). The UK’s tax collection agency has been employing organisational psychologists as well as statisticians to figure out whom to nudge into paying up, and how.

“At HMRC, analysts need to know when you take someone to court, and what you need to do to succeed in that," says Joe Peppard. "It’s about more than statistical understanding."

John Harris, chairman of the Corporate IT Forum, a membership organisation for business users of IT that includes HMRC, GlaxoSmithKline, and United Biscuits, among others, sees a data science skills gap as a thorny issue. 

If there is corporate gold to be dug up, where are the geologists to identify its location? CEOs and CIOs are struggling to find these people in the UK, he confirms. “Corporate organisations need to find data scientists with unique skills, mathematical, but with business knowledge and the imagination to ask the right questions. They won’t necessarily find them in IT at present."

The Forum’s Education and Skills Commission, which Harris chairs, encourages “vocational programmes, in Sixth Forms and in universities, that challenge students to apply statistical analysis to real life problems”.

"Data science skills are a great example of an emerging capability that our education system needs to be ready for," he says. "There is now a plethora of data publicly available that students could work with, but businesses should also do more to ‘crowd source’, openly sharing data problems and challenges with schools and universities."

Data science: more creative than programming

Nick Halstead is the founder and chief technology officer (CTO) of San Francisco and Reading-based DataSift, which helps organisations improve their understanding and use of social media. 

The company took off as an offshoot of TweetMeme, the Twitter news feed service, and has a heritage in RSS news aggregation. It helps companies mine social media data, such as tweets, Facebook posts and content on blogs.

This British IT entrepreneur – who has been programming from the age of nine – believes that the “true value of big data lies in its highlighting of the term ‘data scientist’, which is getting pupils and students interested in mathematics. It is different to programming. It’s more creative than that. You need to be mathematical but also to be able to see a hypothesis: ‘what can I find in this data?’”

On the consultancy side, Narendra Mulani, global managing director, Accenture Analytics, counsels against an over-emphasis on user organisations developing their own data science capability. “I disagree with the data scientist vogue. Companies will compete on the consumption of analytics. It is fine to have a few data scientists. But you have to get businesses ready to enhance their decision-making. ‘How do I get insight into the right hands and are those hands – whether of a sales person or a nurse – ready to receive it?’”

From an IT supplier perspective, Nick Whitehead, solutions sales director for business intelligence at Oracle, also ventures that technology will industrialise analytics, taking away the pain for customers. 

“We can automate around the skills gap," says Nick Whitehead. "The sort of information discovery we make available through Endeca, for instance, or the social media analysis delivered through our Collective Intellect acquisition make the power of analytics accessible for business users."

He also reports that Oracle in EMEA is stepping up its recruitment of enterprise architects, a role which is logically prior to that of the data scientist.

On the user side, Mats-Olov Eriksson, director of data warehousing at a gaming site, resists the seduction of the current vogue for data science, and echoes Whitehead on the importance of architects.

“It’s a shame that everyone speaks about data science as if that were the only sexy part about working with data," says Mats-Olov Eriksson. "The maintenance part might not seem as cool, but it is much more important. We need more architects who are interested in facilitating other people."
