Sentiment analysis at work: a sentimental education for the data rich

Marketing services company Mindshare UK and hotel chain Marriott International are mining social media to analyse customer sentiment. But a need for human interpretation persists.

This article can also be found in the Premium Editorial Download: IT in Europe: Social media mining yields customer analysis

The Cannes Lions International Advertising Festival is one of the most prestigious events in the advertising industry's calendar. Last year, one lucky winner of the coveted Gold Lion award was Mindshare, a marketing and media network that forms part of the global advertising giant WPP.

Mindshare UK scooped the award for a customer-recruitment campaign it devised on behalf of retail bank First Direct. Online forums, blogs, comment threads and social networks such as Twitter were mined to hear what existing First Direct customers were saying about the bank – both good and bad. These comments were then broadcast as live digital advertising on the London Underground, in train stations and in shopping centres.

It was a daring campaign at a time when consumer trust and confidence in the banking sector were in the doldrums – not to mention a sophisticated use of business intelligence (BI) tools and techniques.

But Mindshare has a wealth of experience in this area, according to Mark Bulling, business director at the company. “Clients have long needed to understand why one campaign is better received than another, why it reaps better results in a particular geography, or via a particular channel, or when targeted at a particular customer segment, so we’ve always had to stay up to date with new ways of collecting and slicing data,” he said.

“And in the current economy, there’s more pressure than ever to be able to provide that data, so that clients can be confident about justifying their advertising spend.”

Much of the data analysis effort at Mindshare is in the hands of Andrew Corroll, the company’s head of data integration. Like many other BI professionals, he’s finding that social media provides a potential treasure trove of new data but, at the same time, presents some interesting challenges.

“It’s not the quantitative analysis of social media sources that’s the problem,” he said. “Thanks to Twitter APIs [application programming interfaces] and the like, it’s not so difficult to measure the number of online mentions of a product that a campaign generates, or to identify exactly which bloggers are the most influential in terms of generating comments about a product or brand.”
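The kind of quantitative measure Corroll describes can be sketched in a few lines of Python. This is a minimal illustration, not Mindshare's code: the `posts` list, the author names and the `count_mentions` helper are all hypothetical, and in practice the posts would be drawn from a social media API rather than typed in by hand. Counting mentions per author is also only a rough proxy for influence.

```python
from collections import Counter

# Hypothetical sample posts; real data would come from a social media API.
posts = [
    {"author": "alice", "text": "Loving my new account with First Direct"},
    {"author": "bob", "text": "first direct support was slow today"},
    {"author": "alice", "text": "First Direct sorted my card in minutes"},
    {"author": "carol", "text": "Thinking about switching banks"},
]

def count_mentions(posts, brand):
    """Return (total mentions, Counter of mentions per author) for a brand name."""
    brand = brand.lower()
    per_author = Counter()
    for post in posts:
        # Case-insensitive substring match on the brand name.
        if brand in post["text"].lower():
            per_author[post["author"]] += 1
    return sum(per_author.values()), per_author

total, per_author = count_mentions(posts, "First Direct")
print(total)                      # number of posts mentioning the brand
print(per_author.most_common(1))  # the author who mentions it most often
```

Counting of this sort says how much a brand is being talked about and by whom, but nothing about whether the talk is favourable.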

What’s more difficult, he continued, is identifying the mood expressed in an online comment – whether it is positive, negative or neutral. “That’s a much harder thing to do,” Corroll said. “With data collected from social media sites, we’re increasingly exploring the structure of grammar – how words are connected to convey meaning, for example – which is a totally different kind of analysis. It’s less quantitative and more qualitative.”

This is where sentiment analysis comes in. The technology, in itself, is not new. Large consumer brands have been using sentiment analytics technologies for some time. These apply natural language processing (NLP), computational linguistics and text analysis to unstructured data in order to gauge the ‘tone’ of conversations.

What is new is the application of these technologies to information collected from social media sources, not to mention the sheer volume of this kind of data that now exists online.

For general BI, Mindshare works with a variety of different tools from different vendors – but, for now, its strategy for sentiment analysis is firmly based on open-source scripting languages, including Python and R.

“Both of these scripting languages lend themselves well to the drawing down of text from social media sites and the application of textual analysis processes to it,” said Corroll. Python users, for example, can draw on the Natural Language Toolkit (NLTK), a suite of libraries and programs for symbolic and statistical NLP.
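To give a flavour of what hand-crafted text analysis involves, here is a deliberately simple sketch of word-list scoring in plain Python. It does not use NLTK and is not the approach Mindshare describes; the word lists, the negation rule and the `classify` function are assumptions made purely for illustration.

```python
import re

# Tiny illustrative word lists; a real lexicon would be far larger.
POSITIVE = {"good", "great", "love", "helpful", "friendly"}
NEGATIVE = {"bad", "slow", "rude", "hate", "terrible"}
NEGATORS = {"not", "never", "no"}

def classify(comment):
    """Label a comment 'positive', 'negative' or 'neutral' by counting lexicon words.

    A negator directly before a lexicon word flips that word's polarity.
    """
    tokens = re.findall(r"[a-z']+", comment.lower())
    score = 0
    for i, token in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("The staff were friendly and helpful"))
print(classify("Service was not good"))
print(classify("Oh great, another hour on hold"))
```

The negation rule is a crude nod to how words connect to convey meaning: "not good" is scored as negative rather than positive. The third comment is sarcastic, yet a word-counting method like this one labels it positive, which is the sort of error that grammar-aware analysis and human review are meant to catch.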

The hand-crafted approach that Mindshare uses may demand more technical skills, but it offers the company the most cost-efficient way to analyse text for sentiment, said Corroll.

Sentiment analysis off the shelf

Other companies are starting to invest in commercial tools for sentiment analysis. In many cases, these are still standalone. But increasingly, BI software suppliers are developing social media analytics suites that combine quantitative analysis tools – such as volume of conversation and share of voice – with tools for qualitative analysis, including sentiment analysis.

Hotel chain Marriott International, for example, has been working with one of the biggest BI companies on the first version of its social media analysis platform, which was launched last year. One of the most important goals of this collaboration was to get an insight into how Marriott’s brand is perceived online, what kind of sentiment surrounds the brand and who the brand’s most influential advocates are, according to Mike Keppler, senior vice president at Marriott International.

Speaking at a software user conference last year, Keppler outlined the findings of this experiment. He was impressed, for example, that Marriott was able to pick as many blogs, Twitter streams, PR newswires and Web crawlers to analyse as it wanted.

“In general, working with a limited data set, we found very positive sentiment and more important, we found very little negative sentiment,” he said. The software also maintains a continuous archive of conversations for ongoing analysis, enabling companies to analyse how sentiment changes over time, and it provides multi-language support – important for multinational companies for whom a negative customer comment in French is as valid as one in English.
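Tracking how sentiment changes over time comes down to grouping already-classified comments by period and comparing the balance of positive and negative. The sketch below shows one way to do that; it is not a description of the software Marriott used. The `archive` records, their dates and the `net_sentiment_by_month` function are invented for the example, and the net score (positive minus negative, divided by total) is just one of several possible measures.

```python
from collections import defaultdict

# Hypothetical archive: (ISO date, label) pairs that were classified upstream.
archive = [
    ("2011-01-14", "positive"),
    ("2011-01-20", "negative"),
    ("2011-01-28", "positive"),
    ("2011-02-03", "negative"),
    ("2011-02-17", "negative"),
    ("2011-02-25", "neutral"),
]

def net_sentiment_by_month(records):
    """Return {YYYY-MM: (positive - negative) / total comments} for each month."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for date, label in records:
        counts[date[:7]][label] += 1  # date[:7] is the YYYY-MM prefix
    result = {}
    for month, c in sorted(counts.items()):
        total = sum(c.values())
        result[month] = (c["positive"] - c["negative"]) / total
    return result

trend = net_sentiment_by_month(archive)
for month, score in trend.items():
    print(month, round(score, 2))
```

A score near 1 means a month's comments were overwhelmingly positive, a score near -1 overwhelmingly negative. In this made-up data the score falls from January to February, the kind of shift an analyst would want to investigate.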

Marriott was also able to identify which high-profile bloggers regularly posted positively about its brand. “We have a voice as a company, but we also found that there were other players that were talking about Marriott,” Keppler said.

Many industry watchers, however, point out that sentiment analysis is still a complex and uncertain activity. While the technology is designed to pinpoint significant words in unstructured data as a means of evaluating, for instance, customer reaction to a new product or service, the complexity of human conversation makes accuracy difficult. Comments that are sarcastic or unpredictable, or that make specific cultural references, can be difficult for humans to interpret, let alone machines. Sentiment can also be swayed by temporary factors, such as the poster’s mood at the time of writing.

As a result, most sentiment analysis applied to social media data still requires human intervention for findings to be interpreted and validated before any action is taken.

Social media analytics in silos

And there are wider issues with the first generation of social media analytics software, said James Kobielus, an analyst with IT market research company Forrester Research. In many cases, he said, these suites remain silos, “separate from existing BI, data warehousing, predictive analytics, complex event processing and data integration tools. Few vendors provide best-of-breed integrated tools in all of these areas, and the high price tag and scarcity of skilled development and modeling personnel who can work with these technologies spell high total cost of ownership for the unwary.”

So, before any consumer brand pitches headlong into attempting to analyse what customers are saying about it online, there are important decisions to be made. Can the company afford to monitor all social media traffic, given the persistent need for some kind of human interpretational effort? Can it afford to respond to all the issues that are identified, or should it tackle a prioritised subset first and postpone the rest? And, above all, should it consider putting social media ‘listening’ in the hands of a skilled outsourcer, given the need for human intervention and the potential costs involved?

Until these issues are resolved, companies might struggle to ‘hear’ true sentiment above the general babble of online conversations.

Jessica Twentyman is an experienced business and technology journalist. Over the last 15 years, she has been a regular contributor to some of the most respected UK national newspapers and trade magazines, including the Financial Times, Director magazine and Computer Weekly. The bulk of her work focuses on the technology products and services that smart companies use for real competitive advantage.
