Tagging along

A few text-based tags can go a long way. XML is little more than a smarter, more flexible language that looks similar to HTML, but it offers huge potential benefits. So why are so few people using it?

It's not common knowledge, but God speaks XML. No, really, don't laugh. He switched to using this W3C-approved communications protocol shortly after it first came out, because He was fed up with taking prayers in proprietary formats. Oh, all right, He doesn't - but with vendor claims about XML becoming more superlative every week in the bid for more revenue, you could be forgiven for believing it.

XML has its roots in the Standard Generalized Markup Language, or SGML (otherwise known by its sexier name, ISO 8879:1986). When XML 1.0 was voted in as a Recommendation by the World Wide Web Consortium (W3C) in February 1998, it set the scene for a frenzied vendor marketing-fest. The question is, is it as important as vendors seem to think? The answer is yes - but the vendors are ruining it for themselves.

From an e-commerce perspective, the XML standard carries huge potential, both in the business-to-business and the business-to-consumer markets. Just as a Hypertext Markup Language (HTML) document uses instructions in simple text-based 'tags' to define how information on a Web page will look, XML can be used to encode information using the same style of tags. The difference is that whereas HTML offers a fixed set of tags concerned with page layout, XML lets you define your own tags to describe any kind of information.

In short, it's possible to extend the basic types of data that an XML document can define into different specialist areas. This means, for example, that a shipping company could produce an XML document with information about shipments and the document could contain tags explaining what the information meant and what to do with it.
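To make the shipping example concrete, here is a minimal sketch in Python of what such a document might look like and how a receiving program could read it. The tag names (shipment, origin, handling and so on) are invented for illustration and do not come from any real shipping standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical shipping vocabulary: these tag names are assumptions,
# not taken from any published XML language.
SHIPMENT_XML = """<shipment id="S-1001">
  <origin>Southampton</origin>
  <destination>Rotterdam</destination>
  <weight unit="kg">1250</weight>
  <handling>refrigerate</handling>
</shipment>"""

def read_shipment(xml_text):
    """Parse a shipment document into a plain dictionary."""
    root = ET.fromstring(xml_text)
    weight = root.find("weight")
    return {
        "id": root.get("id"),
        "origin": root.findtext("origin"),
        "destination": root.findtext("destination"),
        "weight": float(weight.text),
        "weight_unit": weight.get("unit"),
        # The 'handling' tag says what to do with the goods,
        # not just what they are.
        "handling": root.findtext("handling"),
    }

shipment = read_shipment(SHIPMENT_XML)
print(shipment["destination"], shipment["weight"], shipment["weight_unit"])
```

The tags carry the meaning: the receiving program does not need to know which column or byte offset holds the weight, only that it sits in an element called weight.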

This use of extensible data definitions is potentially useful for companies that want to exchange data between themselves. For the past couple of years, the growth of Internet-based business and the huge amounts of legacy data sitting inside companies' IT systems have created tension between the e-business hype and the more complex reality. Getting businesses to translate their back-end data into formats that other businesses can understand has been a difficult task, not least because with so many different applications and data formats in use, it's prohibitively expensive to try to convert your data into each one.

Just as Esperanto was proposed as the answer to the human language problem, so XML is the proffered answer to the data integration challenge. The idea is to translate your legacy data into XML. It is then sent to your business partners, before being translated into their own applications' data formats at the point of reception. At least this way, you only have to translate your data into one language.
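The round trip described above might be sketched as follows. This is an illustration under stated assumptions rather than anyone's production system: the legacy CSV layout, the orders/order tags and the partner's field names are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical legacy export from the sender's back-end system.
LEGACY_CSV = "order_no,customer,total\n1001,Acme Ltd,249.50\n1002,Widget Co,80.00\n"

def legacy_to_xml(csv_text):
    """Sender side: wrap each legacy row in (made-up) <order> tags."""
    root = ET.Element("orders")
    for row in csv.DictReader(io.StringIO(csv_text)):
        order = ET.SubElement(root, "order", number=row["order_no"])
        ET.SubElement(order, "customer").text = row["customer"]
        ET.SubElement(order, "total").text = row["total"]
    return ET.tostring(root, encoding="unicode")

def xml_to_partner_records(xml_text):
    """Receiver side: map the shared XML onto the partner's own field names."""
    return [
        {
            "ref": int(order.get("number")),
            "buyer": order.findtext("customer"),
            # This imagined partner stores money as whole pence.
            "amount_pence": round(float(order.findtext("total")) * 100),
        }
        for order in ET.fromstring(xml_text).iter("order")
    ]

wire_document = legacy_to_xml(LEGACY_CSV)        # what travels between the firms
partner_records = xml_to_partner_records(wire_document)
print(partner_records[0])
```

Each side writes one translator, to or from the shared XML, instead of one per trading partner.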

Unfortunately, there's a catch, which is why XML is about as widely used in computing as Esperanto is in Europe. The problem is that while many people believe XML to be the holy grail for data integration, it's actually useless on its own. XML is a meta-language - a language designed for the creation of other languages describing specific forms of data.
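A short Python sketch shows why XML alone settles nothing. Both documents below are well-formed XML describing the same invoice, but they use two different made-up vocabularies; code written for one cannot make sense of the other. All tag names here are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Two invented vocabularies describing the same invoice.
SUPPLIER_DOC = "<invoice><amount currency='GBP'>120.00</amount></invoice>"
CUSTOMER_DOC = "<bill><total cur='GBP'>120.00</total></bill>"

def invoice_amount(xml_text):
    """Look up the amount the way the supplier's vocabulary defines it."""
    root = ET.fromstring(xml_text)  # parsing succeeds for any well-formed XML
    if root.tag != "invoice":
        # Well-formed XML, but not a vocabulary this code understands.
        return None
    return float(root.findtext("amount"))

print(invoice_amount(SUPPLIER_DOC))
print(invoice_amount(CUSTOMER_DOC))
```

The parser is happy with both documents; the agreement on what the tags mean has to come from a language built on top of XML.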

There are various XML-based standards on the Internet, and they broadly break down into two areas. The first is vertical applications, focusing on specific industries and document types, while the second targets more horizontal documents and processes that apply to many different business sectors.

A good example of an industry-specific XML language is HR-XML, the human resources industry's protocol for time and expenses data reporting, which enables companies to exchange electronic data on employee hours. Another is adXML, an XML-based language for the advertising industry, which enables companies to exchange information about advertising inventory and costs.

Meanwhile, horizontally-focused standards for trading independently of any particular sector are also emerging. The Business Application Software Developers' Association (BASDA) has been busily working on eBIS-XML, a set of schemas for exchanging standard business documents such as invoices and purchase orders.

While this standard is a good idea, it lacks impetus. Chair Dennis Keeling admits that of 370 member companies, only 20 have been certified. "You can take a horse to water but you can't make it drink. That's why we had to refocus to look at specialist markets," he says. The organisation (which has already practically rewritten the standard once) has now set up specialist groups producing extensions to the schema for specific industries, which will doubtless place it in competition with existing vertical market XML languages.

Another set of XML-based documents for horizontal business transactions is the boleroXML standard, which is designed for companies engaging in international trade. It sits within the Bolero system, which is a framework enabling messages to be sent securely.

Other standards work at higher levels, providing XML-based frameworks containing rules for passing XML-based trading documents and encoding information about the preferences of the companies' customers using them. The one that has received the most press is ebXML, which was jointly formed by the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT) and the Organization for the Advancement of Structured Information Standards (OASIS). You can find out about it at www.ebxml.org. Meanwhile, Microsoft has sponsored an architecture called the BizTalk Framework that focuses on the structure of XML documents, imposing more criteria on companies building languages. It will ultimately enable them to exchange more complex data, including information on how to process the data in the documents, says the firm.

Confused? Andy McBurnie, strategic technology director at B2B e-business software vendor Tranmit, isn't surprised. "We have 60 customers in the UK and the vast majority are still using CSV files [comma-separated values: simple text files, such as those exported from a spreadsheet] to exchange data," he says. He warns that the plethora of different frameworks and languages is bogging the market down. Companies are too afraid to commit to any one XML language, especially when many of them will have to redevelop their existing e-business systems to cope with it.

But one area where XML could take off is in the delivery of data and services to end-users, rather than other businesses. Matt Stiles, UK technology manager at international IT consultancy Cambridge Technology Partners, explains that with the number of different end-user devices proliferating, Extensible Stylesheet Language Transformations (XSLT) can be used to filter a single set of data into formats appropriate for different displays.
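The idea of one data source feeding several displays can be sketched in Python. A real deployment would typically express each view as an XSLT stylesheet; Python's standard library has no XSLT processor, so this sketch swaps in two plain functions as a stand-in. The catalogue tags and both output formats are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical product feed; the tag names are assumptions.
CATALOGUE = """<catalogue>
  <item><name>Kettle</name><price>24.99</price><blurb>Rapid-boil, 1.7 litre.</blurb></item>
  <item><name>Toaster</name><price>19.50</price><blurb>Four-slice, wide slots.</blurb></item>
</catalogue>"""

def to_html(xml_text):
    """Desktop browser view: full detail as an HTML list."""
    rows = [
        "<li><b>{}</b> GBP {}: {}</li>".format(
            item.findtext("name"), item.findtext("price"), item.findtext("blurb"))
        for item in ET.fromstring(xml_text).iter("item")
    ]
    return "<ul>" + "".join(rows) + "</ul>"

def to_small_screen(xml_text):
    """Small-screen view: name and price only, one item per line."""
    return "\n".join(
        "{} {}".format(item.findtext("name"), item.findtext("price"))
        for item in ET.fromstring(xml_text).iter("item"))

print(to_html(CATALOGUE))
print(to_small_screen(CATALOGUE))
```

The data is written once; adding support for a new device means adding a new view, not touching the catalogue.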

XML may be the answer to everyone's prayers, but as with most religions, you shouldn't expect enlightenment from it immediately. The market for the technology needs to mature, and some of the industry dogma needs to abate before companies will feel comfortable using it. We know a lot of customers who will say amen to that.


In a nutshell
  • XML is useless on its own - you have to build languages on top of it

  • There are so many languages that customers are getting confused and scared off

  • It can be used effectively for business-to-consumer commerce, to deliver different information in different formats from the same central data store

Case Study: Reuters
Reuters has been using XML in its information delivery business for the past couple of years. It developed NewsML, an XML document format for encoding information about news articles. It now uses it in its NewsML showcase. One distinct advantage, according to Mark Hunt, director of XML strategy, is automatic linking to relevant information using XML tags. "We can embed a lot more contextual information that might be useful to the user. That could be personalised information based on your profile," he says. It's also possible to create dynamic links to relevant articles from a NewsML-encoded story on the fly. "HTML is passive - in that language I would need to manually build a Web page that has those options on it."
