Web links lead to dead-ends

High levels of disconnectivity in the Web mean that e-businesses must adopt linking strategies. Helen Beckett reports

Companies spend hours poring over their e-business strategies to ensure their Web site is as visible to the target audience as possible. That means spending millions on a catchy name and advertising, but once a site is out on the big wide Web, it is reliant on search engines, and strong links between pages, to attract business to the site.

But according to a new theory developed by scientists and engineers at IBM, Compaq and Altavista, 10% of all pages on the Web exist in isolation, and just one-third make up the central core of strongly linked nodes. The remaining 60% of pages either can be reached from the central cluster but offer no way back (such as corporate Web sites with internal links only) or link into the central core but cannot be reached from it.

The bow tie theory, as it has been dubbed, has far-reaching implications for companies trying to reach out to end-users, be they in the consumer or business environment. It also calls into question the efficiency of many of the search engines currently trawling the Internet.

"If a search engine wants to increase the size of its index," explained Andrew Tomkins, a member of the IBM Almaden research staff, "simply adding more bandwidth and machine power will not take it beyond the central core of the bow tie."

Research staff at search engines such as Altavista have already seized on the information, and are using it to pioneer new ways of trawling the Web in pursuit of the comparative pricing data demanded by many e-businesses and consumers.

Clive Featherstone, Internet expert with ISP Thus, and an executive member of the ISP Association, said: "A lot of people naively think that if you start anywhere, you can get anywhere. In fact, that is only true of a quarter of the bow tie and only a quarter of the Web."

While the research is unlikely to hold any surprises for technologists, it is a clarion call for Web developers and marketeers to rethink the relationship between their site and the rest of the Web. It drives home the message that if a company wants to be seen, it must put itself at the heart of the central cluster - even if that means advertising a rival. While marketing departments may have thought that a successful Web strategy called for a combination of advertising, reliance on search engines and ultimately surfers bookmarking a favourite site, the bow tie theory suggests otherwise.

"People might stick your site onto a bookmark, but the Web is never going to guide them," said Robin Bloor, CEO of Bloor Research. He likens the Web to a road network: "If you want people to drive to your Web site, you'd better make sure it links to the motorway - those sites which are highly travelled and well connected."

The most successful businesses, he added, are those that implement connection strategies: "No-one is going to strangle traffic onto competitive sites. It's better to have cross-links."

Other commentators have welcomed the research but want more regular crawls of the Internet to show whether the bow tie could be mapped to geographic dispersal of links, for example. Stefan Silverman, master technologist with Scient, said: "It confirms popular wisdom and places a matrix upon it." However, he pointed out that the research data represents a static slice of time. He called for further analysis to be undertaken to show which pages were being retrieved.

The Almaden team says the bow tie theory is just a first step and plans further research to refine what it admits is a primitive model. Planned investigations include whether any of the regions of the bow tie would break into similar fractal structures - lots of miniature bow ties - and whether tightly connected pages link to frequently visited topics.

Whatever the results of later research, the team's early work has proved that companies need to rethink some of the well-worn beliefs on users' surfing habits if they are to really produce a winning e-business strategy.

Unravelling the bow tie

Producing the bow tie theory was a massive undertaking. Three crawls through 500 million Web pages in total mapped pages as nodes and hyperlinks as arcs. This allowed the research team in Almaden, California, to construct a graph revealing paths between pages. The connected Web was found to be divided into four parts:

  • the central, highly connected core or knot of the bow tie, where pages have hyperlinks to *and* from other pages in the core

  • one section adjacent to the core (one loop of the bow), where pages have hyperlinks into the core pages but not from them

  • another section adjacent to the core (the other loop of the bow), where pages have links from the core pages but not into them

  • the tendrils (the straps of the bow tie) abutting the loop sections, where pages have links either into pages that have no link into core pages or from pages that have no link from core pages

The concern for those reliant on an e-business strategy is that 10% of all Web pages are not part of the bow tie at all and are described by the research team as islands. If your company's pages fall into this category, its Internet strategy is likely to fail.
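For readers who want to see how the regions fall out of a link graph, here is a minimal sketch in Python. It is not the research team's method: the function names, the sample links and the idea of starting from one page already known to sit in the core are all assumptions made for illustration. It also treats every page that is joined to the bow tie but lies outside the core and the two loops as a tendril, which is a simplification.

```python
from collections import defaultdict


def reachable(start, adjacency):
    """Return every node that can be reached from start via adjacency."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def classify_bow_tie(links, core_page):
    """Split pages into bow tie regions (hypothetical helper).

    links is a list of (source, target) hyperlinks; core_page is a page
    assumed to belong to the central core.
    """
    forward = defaultdict(set)     # page -> pages it links to
    backward = defaultdict(set)    # page -> pages that link to it
    undirected = defaultdict(set)  # links with direction ignored
    pages = {core_page}
    for src, dst in links:
        forward[src].add(dst)
        backward[dst].add(src)
        undirected[src].add(dst)
        undirected[dst].add(src)
        pages.update((src, dst))

    downstream = reachable(core_page, forward)   # pages the core can reach
    upstream = reachable(core_page, backward)    # pages that can reach the core
    core = downstream & upstream                 # links both to and from the core
    into_core = upstream - core                  # loop with links into the core
    out_of_core = downstream - core              # loop with links from the core
    joined = reachable(core_page, undirected)    # anything attached to the bow tie
    tendrils = joined - core - into_core - out_of_core
    islands = pages - joined                     # not part of the bow tie at all
    return {
        "core": core,
        "in": into_core,
        "out": out_of_core,
        "tendrils": tendrils,
        "islands": islands,
    }


# Illustrative links only; the page names are made up.
sample_links = [
    ("a", "b"), ("b", "a"),   # two core pages linking to each other
    ("in1", "a"),             # links into the core, no link back
    ("b", "out1"),            # linked from the core, no way back
    ("in1", "t1"),            # tendril hanging off the "in" loop
    ("t2", "out1"),           # tendril feeding the "out" loop
    ("x", "y"),               # island, cut off from the bow tie
]
regions = classify_bow_tie(sample_links, "a")
for name, members in regions.items():
    print(name, sorted(members))
```

A crawler that starts in the core and only follows links forward would find the core and the "out" loop in this sample, and nothing else, which is the limitation Tomkins describes.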
