This is a guest blog post by Jane Zavalishina, CEO, Yandex Data Factory
Well-established enterprises like retailers or manufacturing companies now have an abundance of data at their disposal. Unfortunately, merely possessing vast amounts of raw data does not lead directly to increased efficiency or the rapid development of new revenue streams. Instead, everyone must now figure out exactly how to make this data work for them.
Following in the footsteps of the internet giants – Google, Facebook and others – established enterprises are eager to invest in advanced analytics solutions to capitalise on the opportunities that possessing this data presents. To that end, an increasing number of businesses are deciding to bring machine learning in-house, introducing new departments and resources to accommodate it. Others are choosing to collaborate with external teams to tackle the task. Whichever approach is chosen, it brings a new and distinct set of challenges to resolve.
The main challenge is revealed in the name of the discipline itself: “data science”. To succeed, enterprises need to merge two very different worlds – an economics-driven business and a scientific, data-driven department. While the cultural and organisational clashes are hard to avoid, they are rarely foreseen.
Here are a few things to keep in mind:
The business must set the goals, and that may not be easy
Decision-making in businesses is far from data-driven – with authority, persuasion and vision playing a significant role. Science, however, is based purely on evidence and experiments. Synthesising these two approaches is the primary challenge when you start to work with data science.
Businesses will have to learn to formulate tasks based on what they want to predict or recommend, rather than asking for “understanding” or “insights”. Despite being used to explicable measurements and disputable arguments, they will have to learn to work with uninterpretable results by managing them through correctly defined metrics. Translating the business problem into a mathematical statement with precisely defined restrictions, and setting a goal whose metric actually measures the influence on the business, is an art in itself.
For example, if the goal is to improve the efficiency of marketing offers, it would be incorrect to task a data scientist with investigating the top ten reasons for refusals or delivering an innovative way to segment the audience. Instead, they should be tasked with building a “recommender” system that will optimise a meaningful sales metric: the margin of the individual order, the number of repeat purchases, or the increase in sales of a specific product group. This metric must be identified beforehand, based on the strategic priorities of the business.
The whole business will be affected, not just one department
When it comes to data science, it is wrong to think that a new means to solve a set of given tasks will be handled by one single department. The introduction of highly precise “black box” models will eventually affect company culture, the organisational structure and approaches to management – all of which must be taken into account to succeed.
One aspect of this is data access. Businesses should work on ways to make data easy to share, as it is too often siloed within individual departments.
Another is the ability to experiment. The only way to estimate the success of a machine learning model is by putting it into practice and measuring the effect compared with the existing approach, isolating all other factors. However, running such A/B tests in an established business may not always be straightforward. For example, retailers aren’t going to just stop promotional activities in a few stores to have a baseline group for comparison. Preparing, organising and strictly following such procedures to measure the effect of data science applications on a business is part of the job, and so will need to be integrated into the company DNA.
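To make the measurement step concrete, here is a minimal sketch in Python of how such a comparison might be evaluated once a baseline group exists. Everything in it is an assumption for illustration: the function names `uplift` and `permutation_p_value` are hypothetical, the store sales figures are invented, and a permutation test is just one reasonable way to check whether a difference could be down to chance.

```python
import random
from statistics import mean


def uplift(control, treatment):
    """Relative difference in the mean metric, treatment versus control."""
    base = mean(control)
    return (mean(treatment) - base) / base


def permutation_p_value(control, treatment, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Repeatedly shuffles the group labels and counts how often a difference
    at least as large as the observed one appears by chance.
    """
    rng = random.Random(seed)
    observed = abs(mean(treatment) - mean(control))
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical weekly sales per store (made-up numbers, illustration only).
control_sales = [100.0, 98.0, 103.0, 97.0, 101.0, 99.0]      # existing approach
treatment_sales = [108.0, 111.0, 105.0, 109.0, 112.0, 107.0]  # model-driven

print(f"uplift: {uplift(control_sales, treatment_sales):.1%}")
print(f"p-value: {permutation_p_value(control_sales, treatment_sales):.4f}")
```

The statistics are the easy part. The organisational work is making sure the two groups really differ only in the approach being tested.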
Last, but not least: managing blame. For decision automation to succeed, top management must support the initiative. However, once the models are put to work, it is no longer possible to place responsibility on any individual person – say, the one who used to sign off the demand forecast. Instead, roles should be redefined, and new ways of assigning responsibilities and controlling the results should be introduced.
Every data science project is a small research project
Leading an in-house data science team also requires a different approach to project management. Data science differs from software development, or other activities where the project can be broken down into pieces, and the progress easily monitored. Building machine learning models involves a trial-and-error approach, and it is impossible to track whether your model is, say, 60% or 70% done.
Businesses, on the other hand, are used to the following process: planning ahead, tracking the progress and looking at tangible intermediate results. With data science, it is no longer viable to plan a whole project and expect smooth movement towards the end goal. Project managers should instead carefully plan quick iterations, keeping in mind that failure is always a possible outcome. Adjusting to a no-blame culture should also be part of the job.
As always with science, you cannot expect to make a discovery according to a plan. But managing a lab efficiently will deliver a product of predictable quality, and not just exploratory research. To succeed, businesses will need to understand how to make this work for them, rather than transferring old project management guidelines to a new team.
You will have to get comfortable with a lack of understanding
Data science is a complex discipline, involving major chunks of statistics, probability theory, and, in essence, years of rigorous studies. Sadly, managers have very little chance of acquiring this knowledge quickly.
When acting as team leads, or as internal clients for data science, managers will need to get comfortable with this lack of technical knowledge. Instead, they should articulate success through results, according to defined experimental procedures put in place to check the quality of the models.
You cannot do everything in-house
Build or buy is a question organisations often face. Many businesses with stronger IT teams are very tempted to run their own data science department, sometimes without fully comprehending the challenge and its implications.
If data science is very close to the core business, that is probably the right choice – it would be hard to imagine Amazon outsourcing its recommendations engine. But when you are an offline retailer, factory, or bank, having data science in-house often equals building a separate R&D business. This presents new business challenges, like competing with the internet giants when hiring rare talent. What’s more, many soon discover that either the financial resources are unavailable, or it is strategically incorrect to add a new data science line of business.
At the other extreme, full outsourcing means lacking expertise in a matter of major importance to modern businesses. The answer lies in a combination of the two. Businesses should strive to build teams that can tackle the challenges that are close to the core business, or that require deep domain knowledge, while competently outsourcing the rest.
Developing this “data science purchasing” capacity is of utmost importance for staying on top of the game. You should be able to formulate the tasks and establish business cases, and to easily test and compare the quality of external algorithms, possibly even running several at a time and switching to the best one without any disruption.
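As an illustration of what comparing external algorithms can look like in practice, the Python sketch below scores two stand-in “vendor” models on the same held-out data and picks the one with the lower error. All names here (`vendor_a`, `vendor_b`, `pick_best`) and all numbers are hypothetical, and mean absolute error is only an assumed example of a metric; in a real setting the metric would be the one agreed with the business.

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple

# A candidate model maps a history of observations to a single forecast.
Forecaster = Callable[[Sequence[float]], float]
Holdout = Sequence[Tuple[Sequence[float], float]]


def mean_absolute_error(model: Forecaster, holdout: Holdout) -> float:
    """Average absolute gap between the forecast and what actually happened."""
    return mean(abs(model(history) - actual) for history, actual in holdout)


def pick_best(candidates: Dict[str, Forecaster], holdout: Holdout):
    """Score every candidate on the same holdout; return the winner and all scores."""
    scores = {
        name: mean_absolute_error(model, holdout)
        for name, model in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical stand-ins for externally supplied algorithms.
candidates = {
    "vendor_a": lambda history: history[-1],    # repeats the last value
    "vendor_b": lambda history: mean(history),  # averages the history
}

# Made-up (history, actual outcome) pairs, for illustration only.
holdout = [
    ([10.0, 12.0, 14.0], 16.0),
    ([20.0, 20.0, 20.0], 20.0),
    ([5.0, 7.0, 9.0], 11.0),
]

best, scores = pick_best(candidates, holdout)
print(best, scores)
```

Because every candidate sits behind the same simple interface and is judged on the same data, swapping one supplier for another becomes a configuration change rather than a new project.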
The role of a data scientist within a business is to access data and “extract value” from it, based on the objectives they have been given. The truth is, data scientists are not magicians. To deliver the required outcomes, they must be given the right metrics, a huge amount of data and the time to experiment, and even then failure remains a possible outcome.
This scientific approach is unusual in business, and understanding how to work with data scientists as a business leader is key to success.