An important bit of the business data infrastructure is
now being decided in the "blogosphere" - that amorphous cloud of
interlinked personal weblogs that seems to consist mainly of
geeks and teenagers writing about their personal lives.
At the moment, a blogger composes a few paragraphs of text and
posts it to a personal web page. This same text is usually
available to people who never visit the blog, but read a syndicated
version of the text in some sort of feed reader. The syndication is
done via RSS (Really Simple Syndication) or Atom, and users usually
subscribe to the feed by clicking an orange XML icon on the
site.
Three things turn this from trivial to important. First, instead
of carrying unstructured text, an RSS or Atom feed could just as
easily carry structured data. Second, that remains true even
if you have neither a blog nor an RSS feed reader. Third, feed
reading capability is already built into many browsers and is now
being plumbed into Microsoft Windows Vista.
So just forget the blogging, and think about using an RSS/Atom
feed to transfer structured data from a product database to a
shopping search engine, such as Google's Froogle. Or events
listings into a local directory, or job vacancies to a recruitment
site, or order information to a customer.
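
As a rough illustration of the idea, the sketch below builds an RSS 2.0 feed whose items carry structured product fields next to the usual title and link, then reads them back as a receiving site might. The namespace URI, the price and currency element names, and the sample product are all hypothetical, chosen for illustration rather than taken from any real shopping service.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the structured fields; not a real standard.
PRODUCT_NS = "http://example.com/ns/product"
ET.register_namespace("p", PRODUCT_NS)


def build_product_feed(products):
    """Return an RSS 2.0 document (as a string) with one item per product."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example product feed"
    ET.SubElement(channel, "link").text = "http://example.com/products"
    ET.SubElement(channel, "description").text = "Structured product data"
    for product in products:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = product["name"]
        ET.SubElement(item, "link").text = product["url"]
        # Structured fields travel alongside the human-readable ones.
        ET.SubElement(item, f"{{{PRODUCT_NS}}}price").text = product["price"]
        ET.SubElement(item, f"{{{PRODUCT_NS}}}currency").text = product["currency"]
    return ET.tostring(rss, encoding="unicode")


def read_prices(feed_xml):
    """Consumer side: pull (name, price, currency) out of each item."""
    root = ET.fromstring(feed_xml)
    ns = {"p": PRODUCT_NS}
    return [
        (
            item.findtext("title"),
            item.findtext("p:price", namespaces=ns),
            item.findtext("p:currency", namespaces=ns),
        )
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    sample = [{"name": "Kettle", "url": "http://example.com/kettle",
               "price": "19.99", "currency": "GBP"}]
    print(read_prices(build_product_feed(sample)))
```

An ordinary feed reader would still show the title and link and ignore the extra elements, while a site that understands the namespace can pick out the price.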
It is perfectly possible to do that today, of course, but sites
often require data in their own formats. With a standards-based
approach, you could feed the same data to 10 sites or to 100
million, or simply ping them to come and get it.
The blogosphere is already working along these lines with
developments such as XHTML-based microformats (as used by
Technorati) and XML Schema-based structured blogging (a technology
backed by PubSub). In December, at the Syndicate conference in San
Francisco, PubSub launched the Structured Blogging Initiative to
push the idea forward.
Most blogs are, and will remain, unstructured, of course, but
most bloggers still have uses for structured data.
Examples include personal information (name, e-mail address),
events (time, date, place, topic), reviews (author, title, format,
actors), items wanted or for sale (name, type, description, price),
and so on.
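
To make the field lists above concrete, here is a minimal sketch of two of those record types as plain Python data classes. The class names, the item_type field name, the to_record helper and the sample values are assumptions made for illustration; they do not come from any structured blogging specification.

```python
from dataclasses import dataclass, asdict


@dataclass
class Event:
    time: str
    date: str
    place: str
    topic: str


@dataclass
class ForSaleItem:
    name: str
    item_type: str  # the "type" field from the list above
    description: str
    price: str


def to_record(entry):
    """Flatten an entry into a plain dict tagged with its kind."""
    return {"kind": type(entry).__name__, **asdict(entry)}


if __name__ == "__main__":
    # Sample values are invented for the example.
    print(to_record(Event("19:00", "20 January", "London", "Microformats")))
```

Once an entry is reduced to named fields like this, it can be serialised into a feed or matched against another site's format without anyone having to parse free text.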
The obvious point is that pretty much all of the blogosphere's
data formats are also important to businesses, and businesses will
have to work with them. However, since most businesses are barely
aware of blogging, and know little or nothing about either
microformats or structured blogging, they will have missed the
chance to participate.
In many cases it won't matter. For example, the hCard
microformat for personal information is based on the vCard "virtual
business card" format already used by Microsoft Outlook and other
address books.
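
The relationship between the two formats can be sketched in a few lines. The example below reads a made-up, well-formed XHTML fragment marked up with the hCard class names vcard, fn, org and email, and prints simplified vCard-style lines. It handles only those three properties, and the output is an approximation: a fully compliant vCard needs more than this (version 3.0, for instance, also requires an N property).

```python
import xml.etree.ElementTree as ET

# Made-up sample markup; the person and address are invented.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Example</span>
  <span class="org">Example Ltd</span>
  <a class="email" href="mailto:jane@example.com">jane@example.com</a>
</div>
"""

# The small subset of hCard properties this sketch understands.
HCARD_PROPERTIES = ("fn", "org", "email")


def extract_hcards(xhtml):
    """Return one dict per element whose class attribute includes 'vcard'."""
    root = ET.fromstring(xhtml)
    cards = []
    for element in root.iter():
        if "vcard" not in element.get("class", "").split():
            continue
        card = {}
        for child in element.iter():
            for name in child.get("class", "").split():
                # Keep the first value found for each property.
                if name in HCARD_PROPERTIES and name not in card:
                    card[name] = "".join(child.itertext()).strip()
        cards.append(card)
    return cards


def to_vcard_lines(card):
    """Render a card as simplified vCard-style lines (not fully compliant)."""
    lines = ["BEGIN:VCARD"]
    for name in HCARD_PROPERTIES:
        if name in card:
            lines.append(f"{name.upper()}:{card[name]}")
    lines.append("END:VCARD")
    return lines


if __name__ == "__main__":
    for card in extract_hcards(SAMPLE):
        print("\n".join(to_vcard_lines(card)))
```

Because the property names line up one to one, an address book that already understands vCard needs very little extra work to accept contact details published this way.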
In some cases, though, it may matter a lot. And it may matter
soon. You or a member of your family may already have read a review
that uses the hReview microformat, since examples include the
review of Harry Potter and the Goblet of Fire at
www.yahoo.co.uk.
Jack Schofield is computer editor at The Guardian.