Ever tried to fit an elephant into an eggcup? Or read a newspaper printed on the back of a matchbox? Probably not, because they simply don't fit - just like, or so say some analysts, relational databases and the new world of mobile computing.
In the IT world, challenging the supremacy of the relational model is little short of heresy. Relational databases are everywhere, underpinning corporate applications, even holding data on the desktop. Based on solid mathematical principles, the relational model is now well understood, well proven and rather well worn.
"We need to take a new look at how we hold data," argues Simon Williams, chief executive of Lazy Software. "We either have to accept that the relational model goes on forever or start wondering what will supersede it."
A few years ago, the object database was cast as the potential challenger to its relational cousin. As it turned out, that was like putting Woody Allen into the ring with Mike Tyson. Object technology may have been convincing intellectually but it just wasn't powerful enough for commercial applications, and was soon out of the running for all but a few niche applications.
Lately, though, a new rival to the relational database has appeared. Variously referred to as associative, semi-structured and tree-structured database technology, it replaces the rigid rows and columns of relational products with a looser tree and branch structure of associations between data items. And its use of Extensible Markup Language (XML) tags to describe the database content makes it possible to create a far more flexible data structure for the Internet era. The question for corporate IT departments will be how well this new approach to information handling fits into an enterprise data strategy.
With relational databases so well established you might wonder why there should be any need for an alternative. The reason lies in the growth of distributed data. Both the Internet and the trend towards mobile computing mean that a large central data repository - the thing that relational technology does best - may no longer be the best answer to the needs of mobile and remote users.
"What's happening with the Web is that people want to acquire data from multiple sources and integrate it," says Luca Cardelli of Microsoft Research in Cambridge. "The trouble is, there's no common schema that lets you take data from, say, your mail address book and integrate it with another data source."
The data in a relational database has no meaning outside its relational schema. The tree and branch structure of associative databases, however, makes it easy - in theory at least - to separate off sections of data without losing the overall structure.
The structure of associative databases can be defined in various ways. But because they lend themselves to Internet and mobile applications, these new-wave databases have a natural affinity with the XML standard being developed by the World Wide Web Consortium, which provides a powerful way of attaching standard presentation and processing rules to data.
XML works by embedding tags in a text document that supply metadata defining what should be done with the data they enclose. For example, XML tags might specify that a certain set of data is to be displayed as a row of spreadsheet items or a three-dimensional model, or a particular field in an EDI form.
Two important principles of XML are platform independence and the separation of content and processing rules. In the case of a customer database, one type of tag might indicate a name field, another an address, another a phone number, and so on. If the tags were standard - admittedly always a big if in the IT world - then it would become much easier to mix and match data from a variety of sources. It would also be easier to split off a subset of your main database and carry it around on a mobile device, since data carries its XML processing information with it wherever it goes.
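To make the mix-and-match idea concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The tag names (customer, name, address, phone), the id attribute and the sample records are all invented for illustration; they are not taken from any published tag standard. The sketch merges customer records from two sources that happen to share a vocabulary, then splits off a one-customer subset of the kind that might be carried on a mobile device.

```python
import xml.etree.ElementTree as ET

# Two hypothetical sources that happen to use the same (invented) tag vocabulary.
ADDRESS_BOOK = """
<customers>
  <customer id="c1"><name>Ann Lee</name><phone>020 7946 0001</phone></customer>
</customers>
"""
ORDER_SYSTEM = """
<customers>
  <customer id="c1"><name>Ann Lee</name><address>1 High St, Leeds</address></customer>
  <customer id="c2"><name>Bob Ray</name><address>2 Low Rd, York</address></customer>
</customers>
"""

def merge_customers(*documents):
    """Merge customer records from several XML documents, keyed on the id attribute."""
    merged = {}
    for doc in documents:
        for customer in ET.fromstring(doc.strip()).iter("customer"):
            record = merged.setdefault(customer.get("id"), {})
            for field in customer:
                # Keep the first value seen for each tag; later sources only add new tags.
                record.setdefault(field.tag, field.text)
    return merged

def subset_as_xml(merged, ids):
    """Build a smaller document holding only the chosen customers, tags included."""
    root = ET.Element("customers")
    for cid in ids:
        customer = ET.SubElement(root, "customer", id=cid)
        for tag, text in merged[cid].items():
            ET.SubElement(customer, tag).text = text
    return ET.tostring(root, encoding="unicode")

merged = merge_customers(ADDRESS_BOOK, ORDER_SYSTEM)
mobile_copy = subset_as_xml(merged, ["c1"])
print(mobile_copy)
```

The merge only works because both sources agree on what a name or a phone tag means, which is exactly the "big if" about standard tags.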
Associative databases are not the first technology to try and take on the relational model. But where object databases were mainly the product of niche companies specialising in object-oriented technology, associative databases are being backed by companies with solid experience of the corporate IT environment, and with experience of building applications to meet commercial demands.
Last autumn, for example, veteran database builder Software AG unveiled the Tamino data store - successor to its long-established Adabas database. Adabas databases were running heavyweight online transaction processing applications way back in the days when relational technology was still new, risky and slow.
The good news for corporate IT departments is that you won't necessarily have to make major changes to take advantage of what new-wave databases have to offer. Just as the major relational database suppliers announced object-relational extensions to enable their products to store complex objects within a relational structure, Oracle, IBM and others have already announced XML facilities within their relational databases.
"We provide a series of XML roadmaps to your data, which means users don't have to concern themselves with where data is coming from," says Gary Pugh, database product marketing manager at Oracle. "Oracle 8i has a series of adapters and transformers that can take a relational file and turn it into XML. It enables you to have content parsed as XML then re-rendered in whichever form a device wants to display it." And for mobile users, Oracle has a 'lite' version of its relational database, designed to run even on PDA devices.
Similarly, Microsoft is building XML support into Access, and IBM's DB2 version 7 includes XML support so users can define, store and retrieve XML-based documents.
But Software AG has gone much further than this with Tamino, an associative database which works natively with XML. "Two and a half years ago, we decided we needed a new, highly modern database technology geared to Internet and e-business," explains Peter Mossack of SAG. "When XML appeared we saw its huge potential as a basis for all new e-business, and decided to base the architecture of our new system wholeheartedly on that concept, with its advantages of completely integrated text retrieval, transactionality, and so on."
Why native XML? SAG argues that bolting XML support onto a relational structure will inevitably lead to inferior performance. The relational data server has to read the XML tags, transform the data into relational tables and then use SQL syntax to store the results. To retrieve the document, the same process has to be carried out in reverse. Native XML support avoids the need for internal data transformation, which, SAG claims, translates into better response time, greater system availability and improved scalability.
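The round trip SAG describes can be sketched in a few lines of Python. This is an illustration of the general idea only, not of how Oracle, DB2 or Tamino work internally: SQLite stands in for the relational server, and the order document, table names and column names are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document; the tags and attributes are invented for this sketch.
ORDER_XML = (
    '<order id="42"><customer>Ann Lee</customer>'
    '<item sku="A1" qty="2"/><item sku="B7" qty="1"/></order>'
)

def store(conn, xml_text):
    """Read the tags, map them onto rows, and insert the rows with SQL."""
    order = ET.fromstring(xml_text)
    order_id = int(order.get("id"))
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)", (order_id, order.findtext("customer"))
    )
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [(order_id, i.get("sku"), int(i.get("qty"))) for i in order.iter("item")],
    )

def retrieve(conn, order_id):
    """Reverse the process: query the tables and rebuild the XML document."""
    (customer,) = conn.execute(
        "SELECT customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    order = ET.Element("order", id=str(order_id))
    ET.SubElement(order, "customer").text = customer
    rows = conn.execute(
        "SELECT sku, qty FROM items WHERE order_id = ? ORDER BY rowid", (order_id,)
    )
    for sku, qty in rows:
        ET.SubElement(order, "item", sku=sku, qty=str(qty))
    return ET.tostring(order, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE items (order_id INTEGER, sku TEXT, qty INTEGER)")
store(conn, ORDER_XML)
round_trip = retrieve(conn, 42)
print(round_trip)
```

Even in this toy case, one small document becomes three inserts across two tables, and reading it back means two queries plus a rebuild. That is the transformation overhead a native XML store would aim to avoid.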
Acknowledging the continuing importance of relational storage, however, SAG has ensured Tamino will provide traditional SQL-based relational data storage, plus links to other major database systems such as Oracle, DB2 and its own Adabas.
But not everyone is convinced the Internet era needs new database technology. "In my view, Software AG wanted a new-generation database to replace Adabas, and it latched onto XML in a technology-driven way, with no clue about the problem it was trying to address," says Tony Percy, VP of strategic planning at Mercator Software. Percy spent 10 years with Gartner Group researching database and middleware issues before joining Mercator. While acknowledging XML's strengths as a data interchange format for the Internet, he criticises the "obsessive" use of a document-based approach to solve a whole range of IT problems - including database management.
Percy argues that the key problem for most database applications is synchronising transactions rather than content. He cites the example of an airline boarding card, which will contain some static content, such as the traveller's name, destination and flight number, but also some dynamically changing information such as departure time. The boarding card is just a snapshot of that dynamic information at a moment in time; to get an accurate picture, you need some way of tracking the processes that generate the data.
Chris Harris-Jones, senior consultant at Ovum, also sees performance hazards ahead for XML databases. "When you mark up content in XML, it will increase in size quite dramatically, which may or may not be a problem," he says. "The theory is that it won't matter because network bandwidth is increasing at the same time, but my view is that it will have an impact anyway."
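The size effect is easy to check for yourself. The sketch below, which uses only Python's standard library, writes the same three made-up customer records once as comma-separated text and once as XML, then compares the byte counts. The records and tag names are invented, and the exact ratio will depend heavily on tag length and on how much data sits between the tags.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented sample data for the comparison.
rows = [
    {"name": "Ann Lee", "city": "Leeds", "phone": "020 7946 0001"},
    {"name": "Bob Ray", "city": "York", "phone": "020 7946 0002"},
    {"name": "Cy Dunn", "city": "Hull", "phone": "020 7946 0003"},
]

def as_csv(records):
    """Serialise the records as comma-separated text with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

def as_xml(records):
    """Serialise the same records with every field wrapped in its own tag pair."""
    root = ET.Element("customers")
    for record in records:
        customer = ET.SubElement(root, "customer")
        for tag, value in record.items():
            ET.SubElement(customer, tag).text = value
    return ET.tostring(root, encoding="unicode")

csv_bytes = len(as_csv(rows).encode("utf-8"))
xml_bytes = len(as_xml(rows).encode("utf-8"))
print(f"CSV: {csv_bytes} bytes, XML: {xml_bytes} bytes, "
      f"ratio {xml_bytes / csv_bytes:.1f}x")
```

The CSV version names each field once in the header, while the XML version repeats an opening and closing tag around every value, which is where the extra bulk comes from.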
Recognising the problem of increased bandwidth, SAG has incorporated powerful compression algorithms into Tamino to cut files down to size. But Harris-Jones wonders how the packing and unpacking of data at each end of the transaction will affect XML's portability and platform independence. "You can use compression to compensate, but then you have to ask yourself how that will work if you're trying to send the same piece of content to a WAP phone, a Palm Pilot, and a conventional browser, all from a single common format," he says. His conclusion is that XML data structures will be effective for small amounts of text or unstructured data, but not for large data volumes.
Another key problem for XML-based data structures, whether used in a relational or associative database, is the perennial problem of standards.
For XML to work as a way of structuring data, the systems at each end of the communication path must be using the same tag definitions. That's easy to ensure within a company, where the organisation has total control over tag definition. Take it out into the wider market, however, and everyone has to be working with the same set of tag definitions. That will mean a mammoth task for the World Wide Web Consortium, which is trying to coordinate the work of the various organisations involved in XML.
The good news, however, is that for once there seems to be a real desire in the IT industry to line up behind XML standards, with even Microsoft - not famous for its willingness to conform to other people's standards - toeing the line. As Cardelli puts it: "This is a lot bigger than even Microsoft."
EDI and the impact of XML
The Electronic Data Interchange (EDI) standard is well established in some industry sectors as a way of exchanging commercial documents and forms electronically. But because EDI is expensive to implement, it has mainly been taken up by larger organisations.
Until now. By providing a standard syntax that gives meaning and structure to electronic documents, XML could provide a more cost-effective standard framework for exchanging data over the Internet than do expensive value-added networks. In doing so, it will open up EDI to a wider range of trading partners and be a powerful tool for automating the supply chain.
Work is in progress on XML/EDI standards that are backwards-compatible with existing EDI transactions, while adding new functionality for the future.
XML/EDI documents provide a structure that contains data about a transaction, and instructions on how the transaction should be processed. Users can define workflow rules for routeing the document and triggering events. For example, the document will be able to route itself to an order processing system using its built-in search, classifying and routeing mechanisms. It will have built-in transaction status that users or applications can set or interrogate, and will be able to inform users that it's one in a linked set of documents used in the workflow, for instance.
Because XML provides presentation instructions separately from the actual data, it will also be possible to present a document in different ways for different users: for example, as a human-readable printed form, as a Web page, and as electronic data for an application to process.
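A small sketch shows the principle. Here one invoice document, with invented tags and values, is turned into three different forms: a dictionary for an application to process, plain text for printing, and an HTML table for a browser. Plain Python functions stand in for the presentation rules; in practice these would more likely be expressed as stylesheets, and nothing here reflects the draft XML/EDI standards themselves.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical invoice; the tag names are assumptions made for this example.
INVOICE = (
    "<invoice><number>1001</number>"
    "<customer>Ann Lee &amp; Co</customer><total>250.00</total></invoice>"
)

def to_dict(xml_text):
    """Electronic data for an application: a plain tag-to-value mapping."""
    return {field.tag: field.text for field in ET.fromstring(xml_text)}

def to_text(xml_text):
    """Human-readable form, one labelled line per field."""
    return "\n".join(
        f"{tag.capitalize()}: {value}" for tag, value in to_dict(xml_text).items()
    )

def to_html(xml_text):
    """Web page fragment: the same fields laid out as an HTML table."""
    rows = "".join(
        f"<tr><th>{html.escape(tag)}</th><td>{html.escape(value)}</td></tr>"
        for tag, value in to_dict(xml_text).items()
    )
    return f"<table>{rows}</table>"

print(to_dict(INVOICE))
print(to_text(INVOICE))
print(to_html(INVOICE))
```

The data is written once and never altered; only the rendering function changes from one audience to the next.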
However, XML/EDI standards are still being defined. A European XML/EDI pilot project is now under way, as part of the EU's Information Society initiative.