How have corporate business intelligence (BI) strategies evolved in recent years in the face of big data? Have companies and other organisations changed their BI tool choices and the ways they set up their teams and technical architectures as data volumes have increased and become less structured, more “messy”?
Here, we first get a CEO-level perspective from two founder-executives at opposite ends of the spectrum of opinion, then some counsel from an analyst perspective, followed by testimony from three user organisations: building society Nationwide, investment management company Schroders and games company King.
Three waves of BI or startup special pleading?
Frank Bien, CEO of business intelligence platform supplier Looker, laid out the thesis of his company to Computer Weekly in 2017, contending that the rise of Hadoop and NoSQL databases had superseded prior generations of BI technology.
Bien’s narrative of business intelligence involves three waves of BI. The first was the big monolithic stacks: Business Objects, Cognos and Microstrategy. “What you got there were complete systems – and you spent a lot of time ‘manicuring the lawn’. By that I mean databases were slow, and were built to do transactions, not analytics. When you wanted to ask a question, you had to reorganise the data physically,” he says. And that, he maintains, became rigid and inflexible.
The second phase was one of “blowing up that stack”, about seven years ago. “There were small tools raining out of the sky to do things separately, like data preparation or visualisation. And, as vendors, we said to customers: ‘You put all that together’,” he says. This, in his view, was the era of Qlik and Tableau.
“At the same time, there was a revolution in data infrastructure, with technologies coming out of Google, Facebook and so on. Then the cloud happened, with Amazon Redshift, Microsoft Azure, and so on, and it became trivial to store everything. So, that second wave of BI tools was evolving while there was a complete revolution underneath it, and the tools did not catch up,” says Bien.
“And so there is a third wave, where there is a reconstitution of a complete platform, but working in this new data world,” he adds. Which is where, in Bien’s view, Looker comes in.
Michael Saylor, CEO and founder of one of the most established business intelligence firms, Microstrategy, disagrees with this analysis. In an interview with Computer Weekly in 2017, he countered: “I think any startup needs a narrative. It is true that there are stages, and companies have to grow and evolve or they get left behind. But a more useful metaphor is that of an expanding universe. The world is not shifting from relational to big data, it is expanding in different dimensions simultaneously.
“You still see a lot of Oracle, SQL and Teradata. There are MDX language sources – OLAP cubes. And there are the Hadoop distributions. Those are three data platforms, and no one of them is going to make the others go away. You can go to the top 5,000 companies on earth and find they are using them all.
“And it goes on. There are applications like Salesforce, Workday, SAP, and so on – people will want to query the data in those directly. And there are web sources. None of those will replace each other, either. It’s more likely that a Coca-Cola will want to join data in Hadoop, Salesforce and Oracle,” he said.
“Now you can build a small or mid-sized company solving a subset of the data problem. Every company has to decide where to make its investments. Some BI company might come along and say, ‘We are the best for the Hortonworks distribution of Hadoop’, and that might fly for a while. But I have been in this business for 27 years, and every three years there is a new data technology which is the rage.”
Complexity, complexity, they’ve all got complexity
So, what is the state of play now, in 2019? Mike Ferguson, one of the most high-profile independent analysts in the field, takes the view that the data and business intelligence problems large companies, especially, have today are less about BI tooling choices and more about data integration. Much more. The main problem companies – and other large organisations – face, he says, is complexity.
“Creating a data-driven organisation is more difficult than people think. I always go back to that Peter Drucker comment, ‘culture eats strategy for breakfast’. It is easier with SMEs [small to medium-sized enterprises], but if a larger organisation has no chief data officer (CDO), or similar, you get lack of organisational alignment,” he says.
“Companies which are doing well are strongly led by their CEOs, and really clear about the priority to be given to data. The problem always seems to lie in the middle, with the cultural issues of people and processes,” adds Ferguson.
In technical architecture terms, he says the idea that companies should bring all their data to one Hadoop system, in a centralised data lake, may be feasible for mid-sized enterprises, but there is such a deluge of data now – machine data, internet of things (IoT) data, social network data, open government data, data from commercial providers – it is difficult to integrate.
“Everywhere you look, there are disconnected pools of data [in corporate organisations]. And you think, ‘Could you not just do this once, and then we could all re-use it?’. So the objective is not to prepare all data for all users, but to incrementally build up a set of ready-made datasets that people can pick up and re-use. There are big organisations very interested in this. And so you would have a group of data stores dedicated to ingestion, a group dedicated to cleaning and preparing stuff, and a group holding trusted stuff,” says Ferguson.
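The staged arrangement Ferguson describes, with stores for ingestion, for cleaning and preparation, and for trusted data, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Zone` and `Catalogue` names, the one-step `promote` rule and the sample dataset names are hypothetical, and nothing here reflects a specific product or any organisation's actual setup.

```python
from dataclasses import dataclass, field
from enum import Enum


class Zone(Enum):
    INGESTION = 1   # raw data as it lands
    PREPARED = 2    # cleaned and prepared
    TRUSTED = 3     # ready-made datasets approved for re-use


@dataclass
class Catalogue:
    # Maps dataset name -> current zone (hypothetical structure).
    datasets: dict = field(default_factory=dict)

    def ingest(self, name: str) -> None:
        """Register a new dataset in the ingestion zone."""
        self.datasets[name] = Zone.INGESTION

    def promote(self, name: str) -> Zone:
        """Move a dataset one stage forward; trusted data stays trusted."""
        current = self.datasets[name]
        if current is not Zone.TRUSTED:
            current = Zone(current.value + 1)
            self.datasets[name] = current
        return current

    def reusable(self) -> list:
        """Names of datasets that anyone can pick up and re-use."""
        return sorted(n for n, z in self.datasets.items() if z is Zone.TRUSTED)


catalogue = Catalogue()
catalogue.ingest("call_logs")
catalogue.ingest("web_clicks")
catalogue.promote("call_logs")   # ingestion -> prepared
catalogue.promote("call_logs")   # prepared -> trusted
print(catalogue.reusable())
```

The point of the sketch is the incremental build-up: only datasets that have passed through preparation appear in the re-usable set, while everything else stays where it landed.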
As for business intelligence software, he says it is being “swamped by data science”, pushed to one side by the now more fashionable topic.
“I think the issue here is twofold. BI has been advancing in the form of artificial intelligence going into the tools themselves to improve productivity. Say, recommending you to click on the right visualisation for the problem you’re trying to solve. The other being simplified interaction with these tools, with natural language processing and chatbot interfaces. But the problem in general with BI is it’s being swamped by so much money going into data science tools,” says Ferguson.
“And there is a war going on with the cloud vendors trying to get people onto their machine learning-based services for analytics. So it is a very fractured world, with BI left to one side a bit. Also, skills are too thinly spread across too many data science technologies. It is chaotic.
“The integration of BI with deployed models from data science is a big area where people want to see integration – with predictions, alerts, forecasts and recommendations made easy to access. And BI vendors who focus on that kind of thing will help,” he says.
How are corporate organisations rendering this complexity tractable, by way of aligning and streamlining their data and analytics strategies?
Paul French, director of business intelligence, visualisation and reporting, data and analytics, at building society Nationwide, explained his company’s approach in a briefing with Computer Weekly at the Gartner Data & Analytics conference in London earlier this year.
In part, Nationwide’s strategy was informed by the way airline pilots are prepared to be “fit to fly” – a process French’s own stepson has recently gone through. You don’t just climb into the cockpit of a jet and press “Go”.
French described how the building society has centralised its data team, under a mandate from CEO Joe Garner, appointed in July 2016. French reports to CDO Lee Raybould. Garner is aiming, says French, to move the society from a hierarchical, top-down culture to a more “accountable freedom” environment.
“From a data perspective, that has been perfect, in terms of supporting that shift by changing our data culture,” he says. “If you want people to be able to make decisions throughout the organisation in a spirit of accountable freedom, then empowering them with the right data, and the right levels of confidence in literacy, is vital. Essentially, we are moving from being data constrained to being data enabled.”
As part of that, Nationwide centralised its data governance, business intelligence, data warehousing and data lakes, and data science staff into one team – presently of 180 people, but recruiting to expand to around 300 in the near future – reporting to the CDO function inaugurated by the society’s CEO.
That central data function found that 50% of the workers at Nationwide were spending the majority of their time on data preparation work.
“Our biggest challenge was we had Excel everywhere, Access everywhere, and SAS being used for the wrong things. SAS is a really great product when used for the right thing, but it was being used for a lot of the wrong things – and understandably, because the business teams that were building stuff in SAS were not getting a central service from the data or IT teams. So they found a way, with tools that were available to them. But we are in a place now where we need to have stronger governance and control, while enabling self-service,” says French.
“Essentially, we have a central data science team that both develops advanced analytics models to support the business, but also helps grow capability and best practice in teams across the society. We operate a hub and spoke model that means we benefit from a central team driving best practice, standards, capability development, technology advancement and focusing on the big, business-wide opportunities, coupled with spoke teams in areas such as marketing and risk which have domain expertise in their business areas.”
He gives an example of how SAS specifically is now being used. SAS Visual Text Analytics allows the firm to experiment with natural language processing. This is helping to determine if it could better understand the root cause of customer contact through browser-based messages from within its internet bank channel, to service members better.
“We’ve a range of these types of opportunities, where we are exploring where richer amounts of data, advances in data science and analytics technologies, and a strong investment in our people – all part of our D&A [data and analytics] strategy – are enabling us to continue to improve the service offering to our members,” says French.
From a technology architecture perspective, the Nationwide data stack includes a Teradata data warehouse appliance with a Hortonworks Hadoop data lake connected to it. Before it decided on that warehouse and lake setup, it had Microsoft SQL Server instances spread through the organisation, he says.
On the BI side, Nationwide has taken a multi-supplier approach. “I’m of the view that there is no single BI tool that takes care of all our use cases. We have SAP Business Objects for very structured, static reporting, and have had for 10 years. We use QlikView for our dashboarding and guided data discovery, and have signed a licence for [the more advanced] QlikSense for up to 5,000 users,” says French.
Qlik was originally a departmental solution for the commercial team around 10 years ago, and Nationwide has accelerated its use of Qlik over the past three years, in line with the more recent wave of its data and analytics strategy.
Nationwide also uses ThoughtSpot – a search-based BI tool requiring little training – in the hands of its front-line employees. “My view is 75% to 80% of people in any business need simple information in a simple, intuitive way. ThoughtSpot provides a natural language-based interface that you can get someone, in a branch or contact centre, up and running with in half an hour’s training,” he says.
But the BI tooling is, he adds, a small piece of the puzzle compared with changing the data culture, which it has been doing with events like data speed dating – where staff from outside the data team can have time with data practitioners – and, most recently, its first hackathon. This, he says, was sponsored by someone who leads the contact centre change team and was organised around a simple question: What is the impact when a member contacts us via the phone?
Some 270 people, ranging from data science modelling teams to an area manager, got involved in the four-week hackathon (dispersed in time and space), working with Cisco call data. “It’s really engaged people, and has started providing insight on a business area we are interested in,” he says. And using, it seems, precisely the “messy” kind of data that lies beyond the numbers sitting in the neat rows and columns of a relational database – in this case, audio.
Asset management firm Schroders, set up in 1804, is another financial services firm that has been modernising its data and analytics strategy. Whereas Nationwide is a Qlik customer, Schroders is an aficionado of its close rival, Tableau.
Mike Renwick, head of data and insights technology at Schroders, gives this account of the problem the firm was trying to solve when, four years ago, it went looking for something like the Seattle-based data visualisation software supplier’s wares.
“Tableau, for us, was always about increasing the surface area of people who were able to solve their own problems in the line of business, with guidance and support,” he tells Computer Weekly. “In the past, technology teams tended to hold the monopoly for building reporting, and the tight interaction required made it difficult for business users to get to the point where they were genuinely asking and answering questions of data.
“There is an interpretation layer between requirements gathering and implementation that meant the people who best understood the data were having to ask others who understood the technology better to do it for them. Tableau shortens that distance and means a finance team can directly interrogate the data themselves with their own data expertise, to find answers to questions in the moment.”
Does he think older BI tools that classically matched up with data warehousing are not such a good fit for big, unstructured or less structured data?
“I think they sometimes can be a good fit, although there is definitely a gap in user-friendly processing of unstructured data for consumption in a BI tool. Unstructured data needs some sort of structuring to be usefully processed – if you have your email inbox as a data source, you would need to turn it into some sort of structured shape to make use of the data,” says Renwick.
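Renwick's inbox example can be made concrete with a small sketch. Assuming a raw message in standard email format, Python's standard `email` module can pull out the header fields as one structured row. The sample message, the `email_to_row` helper and the chosen column names are all hypothetical; a real pipeline would also have to handle attachments, encodings and multipart bodies.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up message standing in for one item in an inbox.
RAW = """\
From: alice@example.com
To: bob@example.com
Subject: Q3 fund review
Date: Mon, 04 Mar 2019 09:30:00 +0000

Notes from this morning's meeting are attached.
"""


def email_to_row(raw: str) -> dict:
    """Turn one raw email into a flat record a BI tool could load."""
    msg = message_from_string(raw)
    return {
        "sender": msg["From"],
        "recipient": msg["To"],
        "subject": msg["Subject"],
        # Normalise the date header to ISO 8601 so it sorts and filters cleanly.
        "sent_at": parsedate_to_datetime(msg["Date"]).isoformat(),
        # Keep a simple numeric measure of the free-text body.
        "body_chars": len(msg.get_payload().strip()),
    }


row = email_to_row(RAW)
print(row)
```

Once every message has the same columns, the inbox behaves like any other table, which is the "structured shape" a reporting tool needs.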
“Tableau is quite interesting, due to its web data connector idea – something quite simple that allows you to write an adaptor from some unstructured source, for example, into structured Tableau data,” he adds.
“Kalpana Chari, capability lead of the Knowledge Analytics team at Schroders, and her team used this to build a connector to ElasticSearch, that allowed our users to look for specific terms appearing in meeting notes, and then see how they trend on a graph – think Google Trends for internal meeting notes. The result is visualised in Tableau, and these entry points into the ecosystem meant it could be extended for this kind of use-case,” says Renwick.
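The "Google Trends for internal meeting notes" idea reduces to counting how often a term appears per time period. The sketch below shows that counting step only, on invented notes. It is not Schroders' connector and uses neither the Tableau nor the Elasticsearch API; the `monthly_mentions` helper, the sample data and the plain substring matching are assumptions made for illustration.

```python
from collections import Counter

# Invented (date, text) pairs standing in for indexed meeting notes.
NOTES = [
    ("2019-01-14", "Discussed inflation outlook and tariffs."),
    ("2019-01-28", "Client asked about inflation again."),
    ("2019-02-11", "Tariffs dominated the call; inflation barely mentioned."),
    ("2019-03-04", "Focus on liquidity."),
]


def monthly_mentions(notes, term):
    """Count case-insensitive occurrences of `term` per calendar month."""
    term = term.lower()
    counts = Counter()
    for date, text in notes:
        month = date[:7]  # "YYYY-MM" prefix of an ISO date
        counts[month] += text.lower().count(term)
    return dict(sorted(counts.items()))


print(monthly_mentions(NOTES, "inflation"))
# {'2019-01': 2, '2019-02': 1, '2019-03': 0}
```

A table of month and count like this is the structured output a charting tool can plot as a trend line. In practice a search engine would do the matching at scale, with proper tokenisation in place of substring counts.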
Describing how the overall data and analytics programme is organised at Schroders and how it has set up its data engineers and scientists to work together, he says the data science team – the Data Insights Unit – tends to be interested in different types of technology to those used in core systems in the business.
“Due to their unique requirements, they help to shift existing thinking around data technology towards more scalable and open models,” says Renwick. “In businesses like ours, which sell to other businesses rather than consumers, it’s not common to encounter datasets with billions of rows of data. However, our data scientists are often dealing with datasets approaching 100 billion rows. This means traditional data analysis techniques can’t be used and the skills of a computer programmer and statistician are needed, as well as access to the big data or cloud technologies to handle data of this volume.
“Our engineers and scientists work closely together, using their different and complementary skills in partnership. Our engineers do a brilliant job of building robust tools and thinking about the whole system. This leads to our data scientists being able to spend more time answering difficult questions, exploring unique data and applying scientific rigour to business decisions for the entire firm,” he says.
And on the question of whether its efforts are centralised, decentralised or a bit of both, Renwick confirms it is a “bit of both”.
“We see a central core of experts as a useful starting point, but have made a point of both connecting to internal initiatives that in a progressive organisation can, and should, be quite emergent and seeding teams with embedded data professionals – think of them like field operatives who get deeply engrossed in specific business lines, but benefit from the broader community of data science and engineering professionals in other areas. Creating some movement here is valuable – jobs in-situ can become repetitive after several years, so having a joined-up ecosystem of this community allows for people to rotate or specialise, or indeed, generalise.”
On the top-level business benefits of its use of Tableau, visible at board level, Renwick says the firm has been able to show an eightfold return on its costs.
“One story involved our investment operations support team that use Tableau dashboards they built to check data quality ahead of trading. It is a mundane-sounding use-case, but has a materially positive impact, lowering errors and omissions and with bottom-line impact,” he says.
A BI strategy is more than a list of tools
While Nationwide, Schroders and King (see box below) all boast clear data and analytics strategies, Andy Bitterer, now an “evangelist” at SAP, but a long-time data and BI-focused analyst, including at Gartner, thinks this is all too uncommon.
Speaking at the co-located Enterprise Data and Business Intelligence & Analytics conferences in November 2018, he said: “In my previous role at Gartner, we used to run an annual survey among 3,000 or so CIOs worldwide. We would ask them what their major objectives were for the next 12 months, and business analytics was always near the top or top. You would think that if that was the case, you would have a strategy around that. But if you ask them about their BI strategy, they usually talk about reporting, which won’t get you to digital transformation.
“A BI strategy is not just a list of the tools that we want to have. Many times I have asked users, ‘What’s your BI strategy?’, and they have replied with the name of a vendor. That’s not a strategy. That’s like Ferrari saying, ‘Our Formula 1 racing strategy is we’re going to use red paint, Bridgestone tyres and Esso fuel, and drive really fast in a circle’.”
Ian Thompson, principal engineer for business intelligence (BI) at Candy Crush maker King, talks about his company’s use of Looker.
What were the business and technical problems you were looking to solve when you went looking for something like Looker?
After roughly five years reporting through the same tool and a rapid expansion due to the success of Candy Crush Saga, our staffing and data needs were faced with two major obstacles in the way users interacted with our main BI tool. The first was that our current set of reports/dashboards were grinding to a halt. The reports created by a centralised BI team were trying to cater to the masses and tick as many boxes as possible for as many users as possible.
This resulted in everyone paying the price for the volume of data required in these reports. Soon, several versions of the same reports were being released, segmented in as many ways as possible. That still allowed a range of different users to answer the questions they were asking of the data with a reasonable user experience, but it is obviously not ideal for several reasons.
The other huge issue we saw was that people were not interacting with the data in our BI tool – they would mostly export huge volumes of data and move to their preferred tool to work, either unaware that they could complete that task in a cleaner, more reusable and robust way inside the tool itself, or unable to do so.
The term “self-service BI” became very popular, and after working even more closely with our stakeholders to make this term become reality, we finally realised the barrier to entry for the majority of our users was too high and, more importantly, the desire to learn how to use this tool was either lost or didn’t exist.
The combination of these factors meant that we would have to produce more specific data models and reports for our stakeholders. With many of these stakeholders unable to help build and create these in our BI tool, the centralised team would be too stretched and we became aware of our need to devolve part or all of the process of producing data exploration or reporting to those stakeholders who know best what they require.
Do you think the older BI tools that classically matched up with data warehousing are not a good fit for big, unstructured/less structured data?
Looker CEO Frank Bien gives a very interesting presentation of the first, second and third waves of BI, and you can see many of the audience nodding along as this is all too familiar to them.
Without wishing to describe “big data”, it very quickly becomes more important to collect data for everything on the chance it has some use. Generally speaking, teams and skills adapted faster than the tooling, and users with this vast new amount of information seamlessly at their hands soon found fault with what had been so reliable and performant for such a long time.
Many of the older BI tools coupled with warehousing are still very good at handling data cubes and reporting but, as we and many others have found, when you wish to delve deeper into the stray, unloved and unclean data, the overhead and constraints you can face make you realise you might have to step away from what has been known and functional for many years.
Why did you choose Looker?
A small team ran a comparison between five data exploration tools. The focus was on “data exploration”, but also trying to move away from in-memory tools due to our data volumes and, arguably most importantly, something that our stakeholders would pick up quickly and get their hands dirty with.
Looker coupled with our MPP [massively parallel processing] database worked very well, and many of the data team were surprised that analysis could be run in real time. Many of us knew this would be the start of the end for producing highly spec’d data cubes for each use case and having to manage the processing and scheduling of this data each and every morning. This was enough for us to take the winner of the evaluation – Looker – to a trial with a stakeholder team.
Looking back at the evaluation scorecard now, I can see that it scored lower than the majority in the “Usability” category, specifically around the development side. I find this very amusing because we had obviously underestimated what our users would be capable of doing with Looker, and hugely underestimated their desire to move to a new tool and realise “self-service BI”.
When did you first start using it?
In mid-2016, while searching around for a stakeholder team to trial Looker on, we came across a team which had grown so disillusioned with our current setup that it was in talks with another vendor to create its own BI stack. Many of the team had previous experience with that vendor’s tool, and while they may still have had the data volumes issue, they understood it and were prepared to create reports and dashboards using it.
We bought ourselves two weeks and spent a couple of days building a small proof of concept for the team. By the end of the trial, we were staggered by the amount and quality of content they had created, as well as model changes made by a few power users who had emerged. After the trial, the team was fully bought in and didn’t wish to go back, and we moved around the business capturing teams one at a time with similar success. Last month, we switched off our legacy BI tool, which has contained only financial reporting for the past year.
Describe your data science and data engineering teams: how big, what their cultures are like
I am lucky to be working in a very data-savvy company. I have spent my time making a real impact with data rather than persuading users of its benefits. Our core data engineering team, which maintains the platforms, data ingestion, ETL [extract, transform, load] and BI, is roughly 25 people. There are between two and 10 data analysts/scientists in each stakeholder team (finance, marketing, game teams, human resources, etc). Different technologies and methods are rarely restricted and roles and remits are very fluid. There are also a handful of other engineering teams which either heavily rely on data or also chip in to create data-based products.
What’s the organisational design of your analytics setup? Centralised, decentralised, or a bit of both?
A bit of both. The data platform and data ingestion team centralise this function to give the company a good data foundation. From there, it becomes slowly more decentralised. Our core ETL team cover most bases with the data products they produce, but we have many data analysts/scientists or even other teams creating data models. The BI platform is also centralised. However, the data models and content inside it are all created, owned and maintained by stakeholders, with the assistance of the core data teams. There are exceptions to this, of course, which is how we experiment and innovate.
Is data integration more important than BI tooling when it comes to your data and analytics strategy?
I would say yes. With the data teams and capability we have in King, this is not as hard a decision as it might be in a smaller company where it might be an either/or situation. We often make use of disparate datasets outside our BI tool, and we try to employ the best engineering practice of bringing all our data into a single warehouse.
What have been the top-level business benefits of your use of Looker?
A lot of people have saved a lot of time that they have been able to invest elsewhere, mostly exploring data. We have put the power into our users’ hands, who know best what they need. We have been able to share our analytics internally and externally with our partners in a more secure and efficient way.