Case study: Yellow Pages digitises records for the web

Canada's Yellow Pages is investing in a digitisation programme to compete with online information providers, such as Google, Facebook and Yell

This article can also be found in the Premium Editorial Download: Computer Weekly: What is being said about you on social media?

Yellow Pages in Canada is investing over $2.5m (Canadian) in a digitisation programme that will give it the ability to compete with global information providers, including Google, Facebook and Yell, on the web.

The company is using sophisticated data management technology to reuse data originally collected for printed business directories to provide what it claims are the most accurate and comprehensive Canadian business search services on the web.

The project will enable Yellow Pages to respond to changes in business data in seconds, rather than days, ensuring that people searching online get the most up-to-date information on Canadian companies, says Andre Boisvert, chief architect at Yellow Pages Canada.

“Our systems were built as an appendix to our print systems, which were, in nature, very much oriented towards the print business. They were very sequential and batch in nature. And the content was optimised to serve print publishing systems, but not online,” says Boisvert in an interview with Computer Weekly.

Preparing IT systems to manage big data

Yellow Pages began a review of its IT systems a year and a half ago, to help it compete head-on with online search services.

The company wanted to ensure that it could accurately combine information from its own databases with social media and other web services to create the most comprehensive and accurate data about Canadian businesses available.

Yellow Pages’ print background, however, meant that it had multiple records of each business, often recorded in slightly different ways, for different business directories.

It needed a way to identify accurately the business that web users were looking for, and to ensure that a company listed more than once was not treated as several separate organisations.

The company evaluated middleware systems from a range of suppliers in May 2013, before choosing a solution from Tibco. It began rolling out Tibco’s master data management (MDM) software in March last year, before going live in September.

The speed offered by Tibco’s software was a critical factor in the decision, says Boisvert.

Yellow Pages receives tens of thousands of updates a day, and each one needs to be matched against its database of 1.6 million entries, using 12 different matching rules.

The Tibco technology, which runs on IBM Linux servers, enables Yellow Pages to identify businesses that consumers are searching for and to match them accurately with data from third-party services and its own databases.

For example, if a consumer is looking for a local pizza restaurant, Yellow Pages can show reviews of the restaurant from third-party services such as OpenTable or TripAdvisor, or from social media services such as Foursquare.

The system is sophisticated enough to work out the address of a business if, for example, people say on social media that they are at the restaurant on the corner of Peel Street and St Catherine’s, says Boisvert.

“One would believe it’s fairly easy to match entries, if you have the name, address and phone number. But people have abbreviations, incomplete information or misspelled information,” he says.
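The difficulty Boisvert describes can be illustrated with a minimal sketch of fuzzy record matching. This is not Tibco's matching engine or any of Yellow Pages' 12 rules, whose details the company has not published; it simply assumes that each record has a name, address and phone field, normalises them, and averages a similarity score using Python's standard difflib module. The abbreviation table, field names and sample records are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road", "blvd": "boulevard"}


def normalise(text):
    """Lowercase, drop punctuation and expand common abbreviations."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(ABBREVIATIONS.get(word, word) for word in cleaned.split())


def digits_only(phone):
    """Keep only the digits of a phone number, so formatting is ignored."""
    return re.sub(r"\D", "", phone)


def similarity(a, b):
    """Similarity between two normalised strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def match_score(update, listing):
    """Average the field scores, skipping fields missing from either record."""
    scores = []
    for field in ("name", "address"):
        if update.get(field) and listing.get(field):
            scores.append(similarity(update[field], listing[field]))
    if update.get("phone") and listing.get("phone"):
        same = digits_only(update["phone"]) == digits_only(listing["phone"])
        scores.append(1.0 if same else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Invented sample records: the update is misspelled and abbreviated.
listing = {"name": "Luigi's Pizza", "address": "1200 Peel Street", "phone": "(514) 555-0100"}
update = {"name": "Luigis Pizzza", "address": "1200 Peel St.", "phone": "514-555-0100"}
other = {"name": "Maple Leaf Plumbing", "address": "88 Queen Street West", "phone": "416-555-0199"}

print(round(match_score(update, listing), 2))
print(round(match_score(other, listing), 2))
```

In this sketch the misspelled, abbreviated update still scores close to 1.0 against the stored listing, while the unrelated business scores low. A production system would need far richer rules, not least because an abbreviation such as "St" can mean either "Street" or "Saint".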

Fast MDM roll-out

Typically, master data management roll-outs take 18 months, but Yellow Pages was able to complete the project in six months, by taking some calculated risks, says Boisvert.

“Projects often fail the first time, and so did we. So our concept was, let’s fail, but let’s fail as fast as possible, rather than spending a lot of time to fail the first time,” he says.

Finding people with the right skills was a challenge. People with expertise in Tibco technology were hard to find, says Boisvert.

The company hired Logimethods, a Canadian systems integrator which specialises in Tibco technology, to help with the implementation. It also brought in consultants from Tibco.

Tuning the MDM to give the fastest performance was a technical headache, Boisvert reveals.

"We went through a number of iterations, because the initial iterations did not produce the response times we were expecting," he says.

Comprehensive business data

The system has ensured that Yellow Pages is able to offer more accurate and more detailed information about businesses in Canada than online competitors, says Boisvert.

In cases where the identity of a business is uncertain, the MDM refers queries to a team of 40 data stewards, who can intervene manually to identify the company the searcher is looking for.
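One common way to implement this kind of referral is to route each candidate match by its confidence score. The sketch below is an assumption about how such routing might look, not a description of the Tibco product: the two thresholds, the outcome labels and the in-memory queue are all hypothetical.

```python
from collections import deque

# Hypothetical thresholds; a real deployment would tune these against its data.
AUTO_ACCEPT = 0.90   # at or above this score, merge without human review
AUTO_REJECT = 0.50   # below this score, treat the update as a new business

# Stand-in for the work queue that data stewards would draw from.
steward_queue = deque()


def route(update_id, score):
    """Decide what happens to an update given its best match score."""
    if score >= AUTO_ACCEPT:
        return "merge"
    if score < AUTO_REJECT:
        return "new_listing"
    # Uncertain identity: hold the update for manual review.
    steward_queue.append((update_id, score))
    return "steward_review"


print(route("update-1", 0.97))
print(route("update-2", 0.20))
print(route("update-3", 0.70))
print(len(steward_queue))
```

Only the middle band of scores reaches the stewards, which keeps manual effort focused on the cases software cannot settle on its own.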

"For us, it’s a competitive differentiator versus the big players. Most of the big players’ sources are the web or self-service. We have information from other sources,” he says.

The listings, which contain information on many companies that do not have a web presence, are gathered from sources such as telephone records and from Yellow Pages’ travelling reps.

The records are used by Google, Yahoo and others under cross-licensing agreements, says Boisvert.
