This is a guest blog post by Kasia Kulma, a senior data scientist at Mango Solutions.
When we think of empathy in a career, we perhaps think of a nurse with a good bedside manner, or a particularly astute manager or HR professional. Data science is probably one of the last disciplines where empathy would seem to be important. However, this misconception frequently leads to the failure of data science projects: the result is a solution that technically works but doesn’t consider the problem from the business’s point of view. After all, empathy isn’t just about compassion or sympathy; it’s the ability to see a situation from someone else’s frame of reference.
To examine the role of empathy in data science, let’s take a step back and think about the goal of data science in general. At its core, data science in the enterprise context is aimed at empowering the business to make better, evidence-based decisions. Success with a data science project isn’t just about finding a solution that works; it’s about finding one that meets the following criteria:
- The project is completed on time, on budget, and with the features it originally set out to create
- The project meets business goals in an implementable and measurable way
- The project is used frequently by its intended audience, with the right support and information available
None of these outcomes can be achieved by a technical solution in isolation; instead, they require data scientists to approach the problem empathetically. Why? Because successful data science outcomes rely on actually understanding the business problem being solved, and on strong collaboration between the technical and business teams to ensure everyone is on the same page. Both are essential, and both are key to getting senior stakeholder buy-in.
In short, empathy factors into every stage of the process, helping to create an idea of what success looks like and of the business context behind it. Without this, a data scientist will not be able to understand the data in context, including technical aspects such as what defines an outlier and how it should be treated during data cleaning. The business, even with less technical understanding, will have far better insight into why data may look “wrong” than a data scientist alone could ever guess at. Finally, empathy helps build trust, which is critical for getting the support of stakeholders early in the process and again in the deployment and evaluation stage.
Given these benefits, empathy is key in data science. To develop this skill, there are some simple techniques that drive more empathetic communication and more successful outcomes. The three key questions that data scientists should be looking to answer are: “What do we want to achieve?”, “How are we going to achieve it?” and “How can we make sure we deliver?”
What do we want to achieve?
For the first question, one approach is to borrow from agile development methodology: consider the different users of a potential solution and iterate to find the core problem, or problems, we want to solve. For each stakeholder, the data science function needs to consider what type of user they represent, what their goals are and why they want this, all in order to understand the context in which the solution needs to work. By ensuring that a solution addresses each of these users’ “stories”, data scientists are empathetically working to recognise the business context in their approach.
How are we going to achieve it?
Then it’s a case of how to go about achieving a successful outcome. One helpful way to think about it is to imagine that we are writing a function in our code: given our desired output, what are the necessary inputs? What operation does our function need to perform in order to turn one into the other? The “function” approach applies not only to data, but also to the process of creating a solution. Data scientists should be looking at an input of “the things I need for a successful solution”, a function for “how to do it”, and then an output of the desired goal. For example, if the goal is to build a successful churn model, we need to consider high-level inputs such as sign-off from relevant stakeholders, available resources and even budget agreements that might constrain the project. Then, in the function stage, it may be time to discuss the budget and scope with senior figures, and to work out whether additional resources need to be hired and what else is needed to drive the right output at the end. This can then be broken down into more detailed individual input-function-output processes to get the desired outcomes. For example, working out whether additional resources need to be hired can itself become a function output, with its own set of relevant inputs and actions driving the solution.
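To make the analogy concrete, here is a minimal Python sketch of the input-function-output idea. It is only an illustration, not a prescribed tool: the `Step` class, the `leaf_inputs` helper and the specific inputs listed for the hiring decision are hypothetical names and values chosen for this example. Each step records its desired output, the action that produces it, and its inputs, where an input can itself be another step.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One input-function-output unit of the planning process (hypothetical)."""

    output: str                     # the desired goal of this step
    action: str                     # the "function": how we get to the output
    inputs: list[Step | str] = field(default_factory=list)  # what we need first


def leaf_inputs(step: Step) -> list[str]:
    """Collect the basic inputs of a step, expanding any nested sub-steps."""
    leaves: list[str] = []
    for item in step.inputs:
        if isinstance(item, Step):
            # A nested step is itself an output with its own inputs.
            leaves.extend(leaf_inputs(item))
        else:
            leaves.append(item)
    return leaves


# Assumed inputs for the hiring decision; the article does not list them.
hiring = Step(
    output="decision on hiring additional resources",
    action="compare required skills with current team capacity",
    inputs=["required skills", "current team capacity"],
)

churn_model = Step(
    output="successful churn model",
    action="discuss budget and scope with senior figures",
    inputs=["stakeholder sign-off", "budget agreement", hiring],
)

print(leaf_inputs(churn_model))
```

Running this prints the flattened list of everything that has to be in place before the churn model can succeed, including the inputs of the nested hiring decision. The nesting is the point of the sketch: any input that turns out to need work of its own can be promoted to a step with its own inputs and action.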
How can we make sure we deliver?
Finally, there are questions that need to be asked in every data science project, no matter what the scope or objective. To ensure that none of them are omitted, data scientists and stakeholders should agree on a checklist, a strategy that has been used successfully in aviation and surgery to reduce failures. For example, making sure a solution suits its target environment shouldn’t be a final consideration, but a foundational part of planning any data science project. A good checklist for data scientists to work through with stakeholders in the planning stage could include:
- Why is this solution important?
- How would you use it?
- What does your current solution look like?
- What other solutions have you tried?
- Who are the end-users?
- Who else would benefit from this solution?
Only with this input can data scientists build a deployable model or data tool that will actually work in context, designed for its eventual users rather than for use purely in a theoretical context.
Empathy may seem an unusual skill for a data scientist. However, embracing it fits into a wider need for a culture of data science within organisations, one that links business and data science teams rather than keeping them in silos. By encouraging dialogue and ensuring all data science projects are undertaken with the stakeholders in mind, data scientists have the best chance of building the most effective solutions for their businesses.