Use data warehousing to make the most of Web data

All the information in the world is useless unless you sort the nuggets from the dirt. Data warehousing offers that possibility.

It used to be called data warehousing, data marting, data mining or, even longer ago, information warehousing. Currently it's being merged into the more fashionable customer relationship management (CRM) arena as the prime method of collecting customer data, which can then be managed all the more satisfactorily.

Perhaps the sexiest name for it is business intelligence.

Business intelligence interrogates the shedloads of information trapped by the acres of computerisation that carpet the corporate organisation. It examines every tiny particle of data that sinks in through every input channel, whether it be a barcode scanner, a website registration or a direct feed from another computer. By analysing that data astutely, a company can discover such useful things as which customers and products are its most profitable, whether its supply chain is running at peak efficiency and much, much more. Once revealed, such information can be fed into the decision-making process - sometimes even in real time - in order to steer the corporate ship more tightly into the wind of profitable endeavour.

What holds for business in general holds for e-business too. No intelligence and you're flying blind, piloting on gut instinct alone, looking to bomb a target you hope is there, while evading enemy pursuit.

E-business intelligence is a no-brainer, and of all the corporate data sources that need to be exploited, your e-commerce website should be top of the list.

The reasons are self-evident. If a retailer's traditional EPOS (electronic point of sale) systems are the catchment area for watching how stock moves out of the shop, the website is even more powerful. It isn't just the point of sale, it's the entire shop. On a website, the retailer can monitor every single step a customer - or potential customer - makes.

The amount of raw data available on customer behaviour via the Web is formidable. Clickstream analysis reveals exactly where visitors came from, when they arrived, how long they spent, what they did, when they left and where they went.
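As a rough illustration of what clickstream analysis involves, the sketch below groups page hits into visits and reports, for each visit, the referrer, arrival time, time on site, pages viewed and exit page. The log layout (visitor id, timestamp, page, referrer), the sample records and the 30-minute inactivity cut-off are all assumptions made for the example, not details of any particular product.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Hypothetical hits: (visitor_id, ISO timestamp, page, referrer)
HITS = [
    ("v1", "2000-05-01T10:00:00", "/home", "searchengine.example"),
    ("v1", "2000-05-01T10:02:30", "/shirts", ""),
    ("v1", "2000-05-01T10:05:00", "/checkout", ""),
    ("v2", "2000-05-01T11:00:00", "/home", "banner.example"),
    ("v1", "2000-05-01T15:00:00", "/home", ""),
]

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity cut-off


def summarise(visit):
    """Reduce one visit to the facts an analyst would ask about."""
    first, last = visit[0], visit[-1]
    return {
        "visitor": first[0],
        "came_from": first[3] or "(direct)",
        "arrived": first[1],
        "seconds_on_site": int((last[1] - first[1]).total_seconds()),
        "pages": [hit[2] for hit in visit],
        "exit_page": last[2],
    }


def sessionise(hits, gap=SESSION_GAP):
    """Split each visitor's hits into visits separated by long pauses."""
    parsed = sorted(
        (visitor, datetime.fromisoformat(ts), page, ref)
        for visitor, ts, page, ref in hits
    )
    visits = []
    for _, group in groupby(parsed, key=lambda hit: hit[0]):
        current = []
        for hit in group:
            # A pause longer than the gap closes the current visit.
            if current and hit[1] - current[-1][1] > gap:
                visits.append(summarise(current))
                current = []
            current.append(hit)
        visits.append(summarise(current))
    return visits


if __name__ == "__main__":
    for visit in sessionise(HITS):
        print(visit)
```

Note that a site's own logs can show the last page viewed but not, in general, where the visitor went next; the sketch therefore records only the exit page.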

"Data is cleaner on the Web," says Dale Vine, senior analyst with Bloor Research. By contrast, "legacy systems are a problem for data warehouses, full of old Cobol with minimal validation."

As well as being a whole lot cleaner, Web data is also likely to be a lot more copious - partly because of the level of detailed capture that clickstream analysis provides, but also because a single website can be a global checkout - the customer catchment area is potentially the population of the world. This means that scalability is as vital an issue for the Web warehouse as it is for the operational running of the site.

"The scalability issue means that the data warehouse database must be able to cope with a very large amount of data and a very large number of customers," points out Chris Ward, marketing manager for Business Intelligence at Oracle. "You also need highly scalable hardware and networking capability."

It isn't surprising, therefore, that a market in outsourcing Web warehouses is growing alongside the market for hosting websites. Parallel processor WhiteCross, for example, has launched an application service provision (ASP) capability, WX/ASP, which analyses the data of customers such as online coupon provider and the UK's leading internet service provider, Freeserve.

But is Web warehousing - outsourced or not - happening at all yet?

Not much. "Websites are not yet data warehoused as a matter of course," says Carl Ward, director of E-business at KPMG.

"It's on the verge of happening," believes Mary Hope, senior analyst at Ovum.

However, it isn't because of the lack of tools. "All the technology exists," says Vine. "It has come on a great deal in the last two years and now it's a lot more slick."

Data warehousing Web data is not fundamentally different from data warehousing from any other source, and the tools have been maturing for some time.

Increasingly, data warehouse suppliers are targeting their offerings at the online market. Companies like Oracle are starting to package suites of products aimed at both building and warehousing websites, while traditional data warehousing stalwarts like NCR are aiming products such as Teradata @ctive Warehouse at the Web. NCR's product is already used, for example, by the likes of US retailer Macy's and a travel site, each of which has a warehouse in the terabyte region. A terabyte is the equivalent of around 500 million sheets of typed A4 paper.
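The paper comparison is easy to sanity-check. The short calculation below assumes a decimal terabyte (10^12 bytes) and one byte per typed character; both are assumptions for the purpose of the check, not figures from the suppliers.

```python
TERABYTE = 10**12        # bytes, decimal definition (assumption)
SHEETS = 500_000_000     # number of A4 sheets quoted in the text


def bytes_per_sheet(total_bytes=TERABYTE, sheets=SHEETS):
    """Bytes of storage implied for each typed sheet."""
    return total_bytes // sheets


if __name__ == "__main__":
    print(bytes_per_sheet())
```

On those assumptions, each sheet works out at about 2,000 characters, which is in the right region for a page of typed text.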

But if the technology for Web warehousing is there, why is there no mass take-up yet? "Everyone is too busy getting their site up and running," says Gary Cooper, research manager at the Butler Group.

If anything, argues Oracle's Ward, it is the established clicks-and-mortar companies that are getting a grip on data warehousing sooner, rather than the start-up dotcoms, which, perhaps, have still to get a handle on the bigger picture of customer relationship management.

"The dotcoms are collecting massive amounts of data and they don't know what to do with it," he says. Without business intelligence techniques, they risk losing the customers they win, and not consolidating on the brands they are building.

But if e-business is still primarily focused on getting started - launching, promoting and operating new websites - it won't be for very much longer. Soon, the pilot phase of e-business will be over, and the serious business of making money will have to start. And that is where business intelligence comes in.

You've got the data but what do you do with it?

  • Use customer behaviour analysis to improve the post-launch design of the site iteratively - i.e. as an aid to e-business operations

  • Use customer analysis as market research to monitor and drive e-business strategy, from the tactical (eg, how well did that last promotion do? how many people hit the latest banner ad?) to the strategic (what is our changing customer profile of Web visitors and what do we sell to them?)

  • Combine Web data with data from all other sales channels in order to provide an integrated bottom-up, top-down understanding of how the company operates in its total marketplace
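The last of these uses, combining Web data with other sales channels, amounts to bringing records together on a shared customer key. The sketch below is purely illustrative: the channel names, customer ids and amounts are invented, and a real warehouse would normally do this work in the database rather than in application code.

```python
from collections import defaultdict

# Hypothetical sales records: (customer_id, channel, amount)
SALES = [
    ("c1", "web", 40.0),
    ("c1", "store", 120.0),
    ("c2", "web", 15.0),
    ("c3", "store", 60.0),
    ("c1", "web", 25.0),
]


def spend_by_customer(sales):
    """Total each customer's spend per channel and overall."""
    totals = defaultdict(lambda: defaultdict(float))
    for customer, channel, amount in sales:
        totals[customer][channel] += amount
        totals[customer]["all"] += amount
    return {customer: dict(by_channel) for customer, by_channel in totals.items()}


def multi_channel_customers(totals):
    """Customers who have bought through more than one channel."""
    return sorted(
        customer
        for customer, by_channel in totals.items()
        if len([name for name in by_channel if name != "all"]) > 1
    )


if __name__ == "__main__":
    totals = spend_by_customer(SALES)
    print(totals)
    print(multi_channel_customers(totals))
```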

    EBay shows the way

    The world's largest person-to-person on-line trading community, eBay, uses Informatica's newly-launched PowerCentre.e software to consolidate the large volume of customer, demographic and click-through data generated from its website to fuel its customer relationship management efforts.

    PowerCentre.e is an expanded version of Informatica's PowerCentre data integration software with new features, such as Web-log extraction, to enable e-business analysis. PowerCentre.e integrates the huge volumes of Web transactions and click stream data with data from more traditional sources such as enterprise resource planning systems (ERP), relational databases, mainframe systems and external demographic databases, thereby helping to consolidate corporate data deriving from multiple sales, supplier and customer-interaction channels.

    Considered one of the company's top initiatives, eBay's e-business analysis system will help the company maintain its competitive edge by increasing the success of its online sellers and improving the buying experience of bidders.

    Macy's data dreams

    US retailer Macy's was one of the first traditional department stores to set up a separate subsidiary dedicated to internet commerce. The website mirrors the department store in merchandise presentation and product offerings. The US company, which has over 400 department stores and over $173bn in annual sales, has used NCR's Teradata-based retail Decisions Intelligent E-Commerce offering as the basis of its one terabyte website data warehouse which is hosted on an NCR WorldMark server. NCR helped Macy's define its business requirements for the new data warehouse, as well as build and manage it.

    Macy's is using the warehouse to target four areas of key concern in its e-commerce venture: measuring the profitability and effectiveness of banner advertising; analysing customer interactions and routes through the website; improving fulfilment capabilities; and correlating online sales with store sales to cross-sell and up-sell to customers across channels.
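To make the first of those areas concrete, here is a minimal sketch of how banner advertising might be scored once impressions, clicks, attributed revenue and spend sit in the same warehouse. The banner names, the figures and the choice of metrics are assumptions for illustration; the article does not say how Macy's calculates them.

```python
# Hypothetical per-banner figures pulled from a warehouse
BANNERS = {
    "spring_sale": {"impressions": 200_000, "clicks": 4_000,
                    "revenue": 9_000.0, "cost": 3_000.0},
    "new_arrivals": {"impressions": 150_000, "clicks": 1_500,
                     "revenue": 1_200.0, "cost": 2_000.0},
}


def score(banner):
    """Click-through rate and net return (revenue minus ad spend)."""
    return {
        "ctr": banner["clicks"] / banner["impressions"],
        # Net return ignores product margin; a fuller model would not.
        "net_return": banner["revenue"] - banner["cost"],
    }


def rank_banners(banners):
    """Banner names ordered from best to worst net return."""
    return sorted(banners, key=lambda name: score(banners[name])["net_return"],
                  reverse=True)


if __name__ == "__main__":
    for name in rank_banners(BANNERS):
        print(name, score(BANNERS[name]))
```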

    List of suppliers

    Identifying a business-intelligence software supplier with a track record in Web-based data warehousing is difficult. Nevertheless, it was clear that the e-word was the most popular aspect of last month's Business Intelligence 2000 show and conference.

    Here is a brief listing of some business intelligence companies that are clearly targeting the Web as a new data source to capture and analyse.

  • NCR Teradata

  • Informatica

  • SAS

  • Informix

  • Wipro Cybermine

  • Contemporary

  • Oracle

  • WhiteCross

    How do you do data warehousing?

    Data warehousing was invented two decades ago by IBM as a mainframe application. Since then, the tools to do data warehousing, data marting, data mining or, as it is increasingly known generically, business intelligence, have been diversifying and maturing, both in terms of the range of suppliers and the range of tools.

    The basic components are: data extraction tools to acquire the data; a database - either relational or a more specialised online analytical processing (OLAP) or multidimensional (MDDB) database - to store the data; and software to analyse the data.
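The components listed above (extraction, storage and analysis) can be sketched end to end in a few lines. This is a toy under stated assumptions: the log format is invented, an in-memory SQLite database stands in for the relational store, and a single SQL query stands in for the analysis software.

```python
import sqlite3

# Hypothetical raw log lines: "date page amount"
RAW_LOG = """\
2000-05-01 /shirts 20.00
2000-05-01 /shoes 55.00
2000-05-02 /shirts 20.00
bad line with no amount
2000-05-02 /shirts 22.50
"""


def extract(raw):
    """Extraction: parse lines, skipping any that fail validation."""
    for line in raw.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        day, page, amount = parts
        try:
            yield day, page, float(amount)
        except ValueError:
            continue


def load(rows):
    """Storage: load the cleaned rows into a relational table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (day TEXT, page TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return db


def analyse(db):
    """Analysis: revenue per page, highest first."""
    query = ("SELECT page, SUM(amount) FROM sales "
             "GROUP BY page ORDER BY SUM(amount) DESC")
    return db.execute(query).fetchall()


if __name__ == "__main__":
    print(analyse(load(extract(RAW_LOG))))
```

Keeping the three stages as separate functions mirrors the way the tools are sold: each stage can be swapped out without touching the others.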

    The latter is increasingly being provided prepackaged for various vertical sectors, such as sales or finance, which massively cuts down the amount of work that users have to do. Since the entry of Microsoft into the market a year ago, costs of data warehousing have been falling dramatically, although both costs and time to production will depend on the amount of data being sourced and the complexity of analysis being run on it.

    Since so much of the focus of data warehousing is on providing the raw data to analyse the way customers relate to the company, many of the established data warehouse suppliers, such as Hyperion and Cognos, are aligning themselves with the more fashionable buzzword, customer relationship management (CRM).

    Business intelligence - the key facts

  • Data warehousing your e-commerce business is essential in order to understand your performance and customer behaviour

  • Website personalisation will be an essential key to winning and retaining customers who are only a click away from your rival - data warehousing is the key to website personalisation

  • Data warehouses that capture, analyse and respond to customer behaviour in real time will become an essential competitive differentiator in the fast moving e-business world

  • Most companies are still too busy building, launching and running their websites to start data warehousing them

  • The website is just another corporate data source, but the data it provides is cleaner, more coherent and comprehensive

  • Web data volumes can be massive. The website data warehouse must be able to cope with these heavy volumes

  • Business intelligence suppliers are only just beginning to target e-commerce as a new market. Many of them are coming in under the new guise of customer relationship management (CRM)

  • Just as companies are outsourcing the hosting and running of their websites, the market for outsourcing the Web data warehouse is opening up as well

  • E-businesses will need to think carefully about data protection issues regarding data collection and selling on

  • Estimating a quantifiable return on investment (ROI) for Web warehouses is very difficult

  • Expect to spend at least a quarter of your Web-build costs on setting up the Web warehouse
