An important bit of the business data infrastructure is now being decided in the "blogosphere" - that amorphous cloud of interlinked personal weblogs that mainly seems to consist of either geeks or teenagers writing about their personal lives.
At the moment, a blogger composes a few paragraphs of text and posts them to a personal web page. The same text is usually available to people who never visit the blog, but read a syndicated version of it in some sort of feed reader. The syndication is done via RSS (Really Simple Syndication) or Atom, and users usually subscribe to the feed by clicking an orange XML icon on the site.
Three things turn this from trivial to important. First, instead of carrying unstructured text, an RSS or Atom feed could just as easily carry structured data. Second, that holds even if you have neither a blog nor an RSS feed reader. Third, feed reading capability is already built in to many browsers and is now being plumbed in to Microsoft Windows Vista.
So just forget the blogging, and think about using an RSS/Atom feed to transfer structured data from a product database to a shopping search engine, such as Google's Froogle. Or events listings into a local directory, or job vacancies to a recruitment site, or order information to a customer.
It is perfectly possible to do that today, of course, but sites often require data in their own formats. With a standards-based approach, you could feed the same data to 10 sites or to 100 million, or simply ping them to come and get it.
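To make the idea concrete, here is a minimal sketch in Python of an Atom feed carrying product data rather than blog posts. It is an illustration under stated assumptions, not a recipe any shopping site is known to accept: the "shop" namespace, the price element, the function name and the sample product are all invented for the example, and the output has not been checked against a feed validator.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
SHOP_NS = "http://example.com/ns/shop"  # hypothetical extension namespace

# Serialise Atom elements without a prefix and shop elements as shop:...
ET.register_namespace("", ATOM_NS)
ET.register_namespace("shop", SHOP_NS)


def build_product_feed(title, feed_id, author, updated, products):
    """Return an Atom document (as a string) with one entry per product."""
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(feed, f"{{{ATOM_NS}}}id").text = feed_id
    ET.SubElement(feed, f"{{{ATOM_NS}}}updated").text = updated
    author_el = ET.SubElement(feed, f"{{{ATOM_NS}}}author")
    ET.SubElement(author_el, f"{{{ATOM_NS}}}name").text = author

    for product in products:
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = product["name"]
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = product["id"]
        ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
        ET.SubElement(entry, f"{{{ATOM_NS}}}summary").text = product["description"]
        # The structured part: a machine-readable price in the assumed namespace.
        price = ET.SubElement(
            entry, f"{{{SHOP_NS}}}price", currency=product["currency"]
        )
        price.text = product["price"]

    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    sample = [
        {
            "id": "urn:example:product:1",
            "name": "Desk lamp",
            "description": "Adjustable desk lamp",
            "price": "19.99",
            "currency": "GBP",
        }
    ]
    print(
        build_product_feed(
            "Example catalogue",
            "urn:example:catalogue",
            "Example Shop",
            "2006-01-10T12:00:00Z",
            sample,
        )
    )
```

A site that understood the extension could read the price directly, while an ordinary feed reader would ignore it and still show the title and summary.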
The blogosphere is already working along these lines with developments such as XHTML-based microformats (as used by Technorati) and XML Schema-based structured blogging (a technology backed by PubSub). In December, at the Syndicate conference in San Francisco, PubSub launched the Structured Blogging Initiative to push the idea forward.
Most blogs are, and will remain, unstructured, of course, but most bloggers still have uses for structured data.
Examples include personal information (name, e-mail address), events (time, date, place, topic), reviews (author, title, format, actors), items wanted or for sale (name, type, description, price), and so on.
The obvious point is that pretty much all of the blogosphere's data formats are also important to businesses, and businesses will have to work with them. However, since most businesses are barely aware of blogging, and know little or nothing about either microformats or structured blogging, they will have missed the chance to participate.
In many cases it won't matter. For example, the hCard microformat for personal information is based on the vCard "virtual business card" format already used by Microsoft Outlook and other address books.
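Because hCard borrows its property names from vCard, turning one into the other is largely mechanical. The Python sketch below shows the idea for a handful of simple properties. It is a rough illustration only: the class and function names are invented, it assumes flat markup with no tags nested inside a property, it does no escaping of special characters, and it omits fields that a strict vCard reader may expect.

```python
from html.parser import HTMLParser

# hCard class names and the vCard properties they correspond to.
HCARD_TO_VCARD = {"fn": "FN", "org": "ORG", "email": "EMAIL", "tel": "TEL"}


class HCardParser(HTMLParser):
    """Collects text from elements whose class is a known hCard property."""

    def __init__(self):
        super().__init__()
        self.fields = []      # (vcard_property, value) pairs in document order
        self._current = None  # property whose text is being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in HCARD_TO_VCARD:
                self._current = HCARD_TO_VCARD[name]
                self._buffer = []
                break

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Assumes no nested tags inside a property, so any end tag closes it.
        if self._current:
            value = "".join(self._buffer).strip()
            if value:
                self.fields.append((self._current, value))
            self._current = None


def hcard_to_vcard(html):
    """Convert a simple hCard HTML fragment to vCard text (sketch only)."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    lines = ["BEGIN:VCARD", "VERSION:3.0"]
    lines += [f"{prop}:{value}" for prop, value in parser.fields]
    lines.append("END:VCARD")
    return "\r\n".join(lines)


if __name__ == "__main__":
    sample = (
        '<div class="vcard">'
        '<span class="fn">Jane Doe</span> '
        '<a class="email" href="mailto:jane@example.com">jane@example.com</a>'
        "</div>"
    )
    print(hcard_to_vcard(sample))
```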
In some cases, though, it may matter a lot. And it may matter soon. You or a member of your family may already have read a review that uses the hReview microformat, since examples include the review of Harry Potter and the Goblet of Fire at www.yahoo.co.uk.
Jack Schofield is computer editor at The Guardian
This was first published in January 2006