Getting wired: Google's biological future?

A service that uses real people to rank Web pages is helping academics find data online

In its early days, the Internet was essentially a medium for communication: the two main services that ran across it were e-mail and Usenet newsgroups. Although FTP (File Transfer Protocol) servers acted as data repositories, they were fairly opaque, and it was hard to find out much about the contents of an FTP site.

In the early 1990s, several new ways of accessing information were proposed, each of which adopted a different approach for rendering holdings more transparent. Gophers, for example, used a series of nested menus to create a hierarchical classification system, rather like a library. By progressively narrowing down the area, it was ultimately possible to find materials of interest. Wide area information servers (Wais), on the other hand, handled free-form materials by indexing them and allowing searches through those indexes.

The fact that both these technologies are unknown to most Internet users today indicates that they failed to catch on. In part, this was down to their own limitations, but mostly it was because something much better turned up around the same time: the World Wide Web.

The Web was easy to use and very powerful for creating free-form links between pages, but it soon suffered from its own success - the more pages there were, the harder it was to find anything. Two solutions were proposed. One - essentially a gopher for the Web - was the hierarchical Web directory as pioneered by Yahoo. The other - a kind of hyperlinked Wais - was the Web search engine.

Search engines went through various iterations before arriving at the method adopted by the current leader, Google. The company describes its PageRank system as follows: "Google interprets a link from page A to page B as a vote, by page A, for page B. But Google looks at more than the sheer volume of votes, or links a page receives; it also analyses the page that casts the vote. Votes cast by pages that are themselves 'important' weigh more heavily and help to make other pages 'important'."
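The voting idea in that description can be illustrated with a small sketch. This is a simplified, textbook-style link-ranking calculation, not Google's production system: the function name, the damping value of 0.85, the fixed iteration count and the three-page example are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by treating each link as a weighted vote.

    `links` maps a page name to the list of pages it links to.
    `damping` and `iterations` are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally "important".
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally among the pages it links to,
                # so a vote from an important page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page Web: A and C both link to B, and B links to A.
example_links = {"A": ["B"], "C": ["B"], "B": ["A"]}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy example B collects the most votes, and A outranks C because its single incoming vote is cast by the "important" page B, whereas C receives no votes at all.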

The usefulness of Google is testimony to the general soundness of this approach, but an obvious variant would be to use real people to rank Web pages. This would clearly be impractical on a large scale, but working with a small number of acknowledged experts in a circumscribed domain is certainly feasible.

Proof is provided by the interesting Faculty of 1000. As its name suggests, this consists of about 1,000 specialists in various fields of biology. These are asked to evaluate and comment on two to four of the most interesting scientific papers that they read each month. They rate these papers as either "recommended", "must read" or "exceptional", and classify them into types such as "novel finding" or "controversial".

The consolidated results are then gathered for each subject area to produce the current top 10 papers, all-time top 10, most-viewed top 10, and hidden jewels top 10 - papers from less well-known journals - that subscribers can access and search through. The papers themselves remain on the site of their publisher, and may require additional subscriptions. How this all looks in practice can be judged from the tour.
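As a rough sketch of how such expert ratings might be consolidated into a top 10, the following assumes a simple points scheme. The numeric weights, the function name and the sample evaluations are hypothetical; the article does not say how Faculty of 1000 actually combines its ratings.

```python
from collections import defaultdict

# Hypothetical weights: the real scoring scheme is not described in the article.
WEIGHTS = {"recommended": 1, "must read": 2, "exceptional": 3}


def top_papers(evaluations, limit=10):
    """Sum the weighted ratings for each paper and return the highest scorers.

    `evaluations` is an iterable of (paper, rating) pairs, one per expert review.
    """
    totals = defaultdict(int)
    for paper, rating in evaluations:
        totals[paper] += WEIGHTS[rating]
    # Highest score first; ties are broken alphabetically by paper name.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Made-up evaluations for illustration only.
sample = [
    ("paper-x", "recommended"),
    ("paper-y", "exceptional"),
    ("paper-x", "must read"),
    ("paper-z", "recommended"),
]
print(top_papers(sample))
```

A scheme like this rewards both the number of experts who pick a paper and the strength of their endorsement, which is one plausible reading of "consolidated results".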

Aside from the approach itself, what is interesting is that the Faculty of 1000 idea could be translated to any field, though it may be that leaders in the business sector, for example, would be less willing to participate in a scheme that requires a certain community spirit.

The company behind the Faculty of 1000, the Current Science Group, also produces BioMed Central. This is notable for adopting a novel publishing model: articles can be read free of charge on the Internet, but the author or their institution pays a fee for the process of peer review and formatting manuscripts - turning the usual approach on its head.
