The language of data integration


Integrating the data that supports electronic business is a huge challenge for even the most cool-headed IT professional. Danny Bradbury looks at XML - and Microsoft's attempt to corral the new language.

Data integration languages that promise to grease the wheels of electronic commerce are highly functional, but slippery to handle.

In the world of business-to-business e-commerce, companies need to be able to exchange data between their servers in a meaningful way, so that as much of the e-commerce business process as possible can be automated. Data integration languages are therefore going to be an important part of the history of Internet-based software development.

XML could be the answer to the problem of providing data integration. Since the World Wide Web Consortium (W3C) adopted it as a proposed standard, almost every supplier has started rolling out XML-compatible products.

The rationale behind the language is simple. Whereas standard HTML comes with a predefined set of tags designed to indicate how things should be displayed on a Web page, XML is designed to provide developers with a more flexible tagging system, so that they can define tags of their own that are appropriate for the type of data that they are exchanging. One example could be a tag that describes the temperature of a patient for exchanging data between medical applications.
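As a minimal sketch of the idea, the Python snippet below reads a small XML message that uses made-up tags for a patient's temperature. The tag names (`patient`, `temperature`), the `unit` attribute and the values are illustrative assumptions, not part of any published medical vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: the tags and the attribute are invented for illustration.
MESSAGE = """
<patient id="12345">
  <name>J. Smith</name>
  <temperature unit="celsius">38.2</temperature>
</patient>
"""


def read_temperature(xml_text):
    """Return (value, unit) from the first <temperature> element."""
    root = ET.fromstring(xml_text.strip())
    element = root.find("temperature")
    if element is None:
        raise ValueError("message has no <temperature> element")
    return float(element.text), element.get("unit")


value, unit = read_temperature(MESSAGE)
print(f"Patient temperature: {value} degrees {unit}")
```

The receiving application only needs to know the tag names to pick out the value, which is the flexibility the custom tags are meant to buy.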

XML is, therefore, not particularly useful for the exchange of data in its vanilla form. Rather, it can be used to create sub-languages that are specifically tailored for vertical market purposes. These sub-languages are normally defined by document type definitions (DTDs), which specify the tags a document may contain and how they may be nested.
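To give a rough feel for what a DTD looks like, the sketch below embeds a tiny, invented one in a document and uses Python's bundled expat parser to list the element types it declares. The vocabulary is hypothetical, and expat only reports the declarations: it does not validate the document against them, which would need a validating parser.

```python
import xml.parsers.expat

# Hypothetical vocabulary: a tiny DTD carried inside the document itself.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE patient [
  <!ELEMENT patient (name, temperature)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT temperature (#PCDATA)>
  <!ATTLIST temperature unit CDATA #REQUIRED>
]>
<patient>
  <name>J. Smith</name>
  <temperature unit="celsius">38.2</temperature>
</patient>
"""


def declared_elements(xml_text):
    """Collect the element names declared in the document's internal DTD."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # expat calls this handler once per <!ELEMENT ...> declaration.
    parser.ElementDeclHandler = lambda name, model: names.append(name)
    parser.Parse(xml_text, True)
    return names


print(declared_elements(DOCUMENT))
```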

Because XML is still immature, much of the work on defining standard DTDs for use in specialised environments is embryonic, so its take-up in the real world doesn't reflect the high profile it has achieved in the e-commerce community.

Andy Longshaw, formerly principal technologist at independent training company QA Training, is excited about XML. For a lot of people it heralds the end of conventional electronic data interchange (EDI) approaches to business-to-business e-commerce, he says. EDI has traditionally involved the expensive construction of fixed routes of communication and complex data exchange protocols.

Nevertheless, Longshaw isn't blind to the problems associated with XML. Speed is an issue, he says. "One company I know makes a lot of proprietary transactions, and it says that, performance wise, there is no competition," he warns, explaining that what XML developers gain over EDI in terms of flexibility, they lose in speed.

"When you're doing several transactions per second, you benefit from having a fixed route, but if you want to open that route up, XML is useful," adds Longshaw.

Part of the problem with XML is the potential increase in file size. XML messages give you the option to embed information which describes the message directly into the message itself - the data tells you how it should be interpreted. This is most useful for companies that are working without a mutual understanding of the DTD, but it is technically inefficient.
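The size penalty is easy to see in a small experiment. The sketch below encodes the same invented order record twice, once as a positional record that relies on both parties agreeing the field order in advance, and once as self-describing XML. The field names, values and the pipe-delimited layout are assumptions chosen for illustration, not a real EDI format.

```python
import xml.etree.ElementTree as ET

# Hypothetical order record; field names and values are invented.
ORDER = {"order_id": "1001", "product": "WIDGET-42", "quantity": "250"}


def as_fixed_format(record):
    """Positional record: both sides must already agree on the field order."""
    return "|".join(record.values()).encode("utf-8")


def as_xml(record):
    """Self-describing record: every value is wrapped in a named tag."""
    root = ET.Element("order")
    for field, value in record.items():
        ET.SubElement(root, field).text = value
    return ET.tostring(root, encoding="utf-8")


fixed = as_fixed_format(ORDER)
xml_bytes = as_xml(ORDER)
print(len(fixed), "bytes fixed-format;", len(xml_bytes), "bytes as XML")
```

The XML version carries the field names with every message, so it is several times larger for a record this small, but a recipient can interpret it without prior knowledge of the layout.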

Embryonic markets were made to be moulded, and Microsoft has jumped at the opportunity to create a space for itself in this arena.

Microsoft's Biztalk is a set of XML standards for describing the format of data within particular industries. The idea is that companies in a given sector will be able to share XML-based information in a meaningful way.

Richard Hamblen, who manages the company's application development products in the UK, explains that Biztalk, announced back in spring 1999, is designed to take the concept of business-to-business Internet data integration one stage further. For one thing, Biztalk is designed to get companies working together on commonly accepted data structures. Biztalk also has the advantage that, rather than simply defining standards so that applications know what a piece of data is, Biztalk enables developers to create schemas that tell the applications what to do with it, he says.

"If all I have is a schema and I say that this is what the data fields will look like, I am not saying how this is used in the packaged solutions that I have," he explains. On the other hand, a Biztalk schema could tell an application that the data is designed to be used in a SAP R/3 environment, and that a certain data element corresponds to a specific stream in the SAP database.
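The distinction Hamblen draws can be sketched in a few lines of Python. The mapping below is purely hypothetical: it is not the Biztalk schema format, and the target field names (`ORDER_NUMBER`, `ORDER_QUANTITY`) are invented rather than real SAP identifiers. It only illustrates a schema that records where each element should go as well as what type it is.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotations: the type of each element plus the system and
# field it feeds. This is NOT the Biztalk schema format, only an illustration.
SCHEMA = {
    "order_id": {"type": int, "target": ("SAP R/3", "ORDER_NUMBER")},
    "quantity": {"type": int, "target": ("SAP R/3", "ORDER_QUANTITY")},
}

MESSAGE = "<order><order_id>1001</order_id><quantity>250</quantity></order>"


def route(xml_text, schema):
    """Convert each known element and label it with its destination."""
    routed = {}
    for element in ET.fromstring(xml_text):
        rule = schema.get(element.tag)
        if rule is None:
            continue  # element not described by the schema: ignore it
        routed[rule["target"]] = rule["type"](element.text)
    return routed


for (system, field), value in route(MESSAGE, SCHEMA).items():
    print(f"{system}: set {field} = {value}")
```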

The company has thrown standards development open to the Biztalk consortium, a collection of companies that are defining vertical market schemas for the technology. It has gained support from big players such as enterprise software company Baan.

So, will Microsoft end up dominating the world by layering Biztalk on top of XML, thereby turning the language into a system-specific set of extended functions? Could this be seen as the emergence of a Microsoft-specific XML - an MSXML, perhaps? The company is merely helping to build the market in the hope that it will be well-positioned to take the lion's share of server sales, says Longshaw. This is all very well, but it does not rule out the possibility of an attempt at domination of the e-commerce-oriented data integration market by Microsoft.

It all sounds like great news for Microsoft, but there are still some issues with Biztalk. Although Microsoft is trying to chivvy the market into accepting the standard by throwing it out to the market for general input, the company still only supports Windows NT. This could be a problem for companies with customers that run systems other than Windows NT, and it's a problem that Microsoft has faced with other technologies such as its DCOM standard, for which multi-platform initiatives have produced little fruit.

Hamblen asserts that it is possible for application and operating system vendors - or even individual customers - to develop interfaces between their software and the Biztalk server, but this appears to go against the established approach to IT - that application development should follow the path of least resistance.

If you want to use a language like XML to handle your business-to-business e-commerce application, then go ahead, but be aware of the other options, because the DTD that pertains to your niche area may not be developed yet, and the Biztalk server that promises deeper data integration with your software isn't shipping either.

Another option for companies focusing exclusively on business-to-business purchases over the Internet is to opt for a managed hub-like service where companies post their product data and suppliers browse and buy using a clearly defined set of mechanisms.

Services such as Commercenet, the business-to-business e-commerce service, are already being used by some very large multinationals to offer an open route to online bulk purchasing.

XML pros and cons

Pros

  • XML is the official standard, adopted by the World Wide Web Consortium (W3C)
  • Defines its own tags
  • High performance quality
  • Cheaper than conventional EDI

Cons

  • Not efficient for exchange of "vanilla" data
  • Not widely used
  • Slower
  • Technically inefficient

BizTalk pros and cons

Pros

  • Companies can work on common standards
  • Creates schemas to tell applications what to do
  • Is supported by major companies such as Baan
  • Can develop interfaces between software and a Biztalk server

Cons

  • Not commercially usable yet
  • Only supports Windows NT
  • Does not follow the established application development approach of "the path of least resistance"


Summary

  • XML adopted as proposed W3C standard
  • Could bring the end of conventional electronic data interchange
  • Developers can define their own XML tags to describe different types of data
  • Microsoft is attempting to exploit the lucrative market with its Biztalk technology
  • Be aware of other options
