The search for meaning

Everyone knows that there are masses of valuable information lurking in the enterprise. So all that's needed is a user-friendly way to find the right stuff and put it to good use.

Just don't call it googling the enterprise. That is the advice to those researching the state of today's enterprise search capabilities, from Mike Davis, senior analyst at research firm Ovum. According to Davis, US search engine Google does not appreciate its name being turned into that kind of generic verb.

But the problem with Google is precisely that its massive success in internet search has made it a familiar name. Everyone knows what Google does. So why not do it inside the enterprise, as well as on the world wide web?

The simple answer is that enterprise search is more complex than it first appears. Specialised enterprise search companies, such as Cambridge-based Autonomy and Norwegian firm Fast, have been grappling with the complexities of enterprise search for many years.

The difference now is that Google, Microsoft and Yahoo have all entered the enterprise search market. "These tools are relatively immature, but still far in advance of what most people have with basic Windows," says Eddie Short, vice-president of technology services at Capgemini. "The key thing is that they have legitimised the market."

The benefits of enterprise search are significant, according to Short, who says users can reap measurable savings, especially if search is coupled with a more structured approach to storage, using a corporate filing plan, classification schema and metadata tagging of documents.

But there are major challenges in enterprise search. "On the internet, you want everyone to see everything," says Mike Lynch, CEO at Autonomy. However, inside the corporate world, information is far more sensitive.

Based on Autonomy's discussions with its customers, Lynch estimates that the average employee sees one in 10,000 documents. "If you want someone to access, let's say, 100 documents, you have to work out the access rights to prevent them seeing all those thousands and thousands of documents you do not want them to see," he says.
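
As a rough illustration of the access-rights problem Lynch describes, the sketch below filters search hits against a user's group memberships before anything is returned. It is a minimal sketch under stated assumptions, not a description of how Autonomy or any other product works: the Document class, the allowed_groups field, the sample documents and the search function are all hypothetical, and a real system would resolve rights against each source's own permission model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical indexed document with the groups allowed to see it."""
    doc_id: str
    text: str
    allowed_groups: frozenset


# Hypothetical sample corpus, invented for illustration only.
DOCS = [
    Document("hr-001", "Board pay review 2006", frozenset({"hr"})),
    Document("fin-002", "Quarterly pay review summary", frozenset({"finance", "hr"})),
    Document("all-003", "Staff canteen menu", frozenset({"everyone"})),
]


def search(query, documents, user_groups):
    """Return ids of documents that match every query term AND that the
    user is entitled to see. Documents the user has no rights to are
    skipped before matching, so their existence is never revealed."""
    terms = query.lower().split()
    hits = []
    for doc in documents:
        # Access check first: no shared group means the document is invisible.
        if not (doc.allowed_groups & user_groups):
            continue
        words = doc.text.lower().split()
        if all(term in words for term in terms):
            hits.append(doc.doc_id)
    return hits


finance_hits = search("pay review", DOCS, {"finance"})
hr_hits = search("pay review", DOCS, {"hr"})
public_hits = search("pay review", DOCS, {"everyone"})
```

In this toy corpus the same query gives different answers depending on who asks, and a user with no rights to the pay documents gets an empty result rather than a list of titles they cannot open.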

Then there is the technical challenge provided by the fact that corporate information is kept in many different places, in many different formats, and changes constantly.

"An enterprise search system has to search, typically, 350 different sources, work out access rights and get the information on screen, all within a few milliseconds," says Lynch. "And if the organisation is global, it may have to do all that in different languages."

Another challenge is that Google works on a popularity basis: if enough people type in the keyword Madonna and then click on links associated with the singer, Google will bring up those pages more often.

"In a corporate environment there are no links, so you cannot use popularity in that way, and in fact popularity is a bad thing in that environment," says Lynch.

"If you are the company's single expert on a specific subject and you need to find information, it is all about what you need, not about what is most popular."

So enterprise search is a very different animal from internet search. But Google has had a powerful impact. "Because Google has been such a pervasive experience on the internet, the expectation of being able to search for information is now moving into the corporate world, and that user experience of just jumping in and getting results is very desirable," says Don Campbell, vice-president of platform strategy and technology at business intelligence specialist Cognos.

Companies like Cognos already supply sophisticated search facilities for users of business intelligence systems, and they are now seeing demand to extend those capabilities more widely across the enterprise. "Users are trying to leverage their investment in business intelligence," says Campbell.

"The industry standard is that 15% to 20% of staff use business intelligence systems, but there is no reason why 100% of staff could not get value from these systems, and the search interface is important to unlock that value."

Unstructured data is the biggest challenge for all corporate search engines. "Over the past five years there has been an explosion in unstructured data - Word documents, e-mail and so on," says Lynch.

"A lot of business know-how is in the unstructured data, so that is what started this rapid growth in demand for search tools."

Another major driver has been the need to comply with increasingly onerous information regulatory frameworks. "There are massive risks. Something could be buried in that data that could cost you millions of pounds or get you sent to jail. So it is literally a matter of survival," says Lynch.

Campbell agrees, particularly when it comes to the security of corporate data. "There is no option to be lax on security. You cannot afford to have an interface with someone searching on keywords where even knowing the results exist could reveal sensitive information. That is a technical challenge that has to be taken on."

Things are changing rapidly in the world of enterprise search. "Google has woken the major suppliers up," says Davis. This is particularly true at the lower end of the market, where small and medium-sized companies are keen to have their own, competitively priced search facilities.

Autonomy's acquisition of Verity was one sign of this, and Microsoft has also been active in this area, announcing several new enterprise search products, including Windows Search Preview and Microsoft Office SharePoint Server 2007 for Search. "Microsoft Office 2007 contains specific software for search," says Davis.

Meanwhile, Google has been working on its enterprise search capabilities. This is nothing new - Google has had enterprise search software available since 2003. Earlier this year, the company released Google OneBox, an enterprise search appliance. "Google's pricing model is very competitive," says Davis.

"Although it does not yet have the widest range of connectivity, like Autonomy and Fast, it is probably only a matter of time. Even Oracle, which has released its own enterprise search product, has said it will work with Google. It is probably unstoppable. And the business benefit is the user interface."

Davis believes large enterprises with challenging search requirements will continue to require specialised search capabilities.

But growing competition between existing search specialists and mainstream internet search players could be good news for companies still wondering how to get at all that knowledge in their growing mountain of unstructured data.

Case study: insurance group's search ends in better service to clients

"Before, we were not able to get at our information at all. Now, we have full access to all our information. There has been a direct benefit to clients." That is the value that Sherene Robson, general manager at insurance firm Cardif Pinnacle, places on the firm's corporate search capabilities.

Cardif Pinnacle, part of global banking group BNP Paribas, is a provider of creditor, warranty and special risks insurance, with a turnover of £727m. The company, which employs 750 staff, needed to provide a way for its users to access all of its corporate information.

This was a challenge. The company had already streamlined its IT infrastructure, consolidating 30 disparate databases into six, but it still needed to provide its users with better access to management information.

"Consolidation had made management information a pain point," says Robson. One of the big challenges at the firm was the number of reports being generated - more than 600 reports from the core business systems alone - and a lack of proper documentation and indexing.

Cardif Pinnacle runs its business on Progress software systems and also uses the CedarOpenAccounts financial suite. When the company needed to add enterprise search capabilities it opted to install EasyAsk's search and information retrieval front end.

This was deployed on top of Aruna Companion, which stores information in a query data set, and overcame the issue of databases needing to predefine how records are linked to each other.

Instead of having tables with pre-designed indexes, each piece of information is automatically indexed when loaded into the Aruna query database.

This has been combined with EasyAsk's natural language search capabilities to provide a very fast search engine.

One of the most frequently queried tables has 36 million transactions, for instance, but Robson says the performance is "outstanding".

Using a natural language query engine means users can make complex queries. A typical query could be: how much premium was received in 2005 for all people named Smith who lived in Kent and what is their average age?
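
To make that example concrete, the question breaks down into three filters (surname, county, year) and two aggregates (total premium, average age). The sketch below shows that structured equivalent in Python. It is only an illustration: the PolicyRecord fields, the sample figures and the premium_and_average_age function are hypothetical and do not reflect the actual EasyAsk or Aruna schema or interfaces, and it assumes one record per policyholder.

```python
from dataclasses import dataclass


@dataclass
class PolicyRecord:
    """Hypothetical record layout, invented for illustration."""
    surname: str
    county: str
    age: int
    year: int
    premium: float


# Hypothetical sample data, not real figures.
RECORDS = [
    PolicyRecord("Smith", "Kent", 40, 2005, 100.0),
    PolicyRecord("Smith", "Kent", 50, 2005, 200.0),
    PolicyRecord("Smith", "Essex", 30, 2005, 150.0),
    PolicyRecord("Jones", "Kent", 60, 2005, 300.0),
    PolicyRecord("Smith", "Kent", 45, 2004, 120.0),
]


def premium_and_average_age(records, surname, county, year):
    """Return (total premium, average age) for the matching records.
    The average is None when nothing matches, to avoid dividing by zero."""
    matched = [
        r for r in records
        if r.surname == surname and r.county == county and r.year == year
    ]
    total = sum(r.premium for r in matched)
    average_age = sum(r.age for r in matched) / len(matched) if matched else None
    return total, average_age


total_premium, average_age = premium_and_average_age(RECORDS, "Smith", "Kent", 2005)
```

The point of a natural language front end is that the user types the question and never has to spell out filters and aggregates like these.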

EasyAsk queries Aruna and the answer is saved as an Excel spreadsheet. The system is used by about 140 users within the company, including the finance department, which is a heavy user, and about 25 analysts in the underwriting and actuarial department, who rely on the system to be able to tap into past performance information when creating new products and systems for clients.

Learning the system is easy, says Robson. "We can train our users to feel reasonably comfortable with this system in about half an hour. It really is as easy as a Google search."

Robson adds that Cardif Pinnacle looked at alternative deployments. "Taking into account the software, consultancy and all the time needed internally, the level of investment we have made in this system is a real fraction of what it could have been," she says.

"We were taking some risk when we installed the system, as the software was fairly new at the time and we only had three or four people rolling it out to the whole organisation. But it has been a great achievement. We are really proud of what we have done. When we went into it, we did not realise the full benefits. We simply needed access to our information," Robson says.

"That is very difficult to quantify, but it is things like searches taking 10 minutes, rather than five days. We have done some comparisons, and something that could take six hours on the Progress database takes two to three seconds using this system, so that speaks for itself."