Storage is simple by association

The new economy has stretched relational databases to their limit, but a revolutionary new database model that is both simple and suitable for diverse markets solves the problem. Eric Doyle reports.

As we enter the age of knowledge management the hairline cracks in the relational model for databases are starting to widen.

The model devised and championed by Edgar "Ted" Codd through the 1970s served well for 20 years, but over the past decade the concept of objects has stretched the relational database to breaking point.

At the moment it is still holding up, buttressed by object-oriented extensions, but there will come a time when the knowledge engineers decide that the foundations need replacing. Perhaps that time has been reached, says Lazy Software's chief executive Simon Williams.

Williams argues that describing today's databases as relational is a misnomer. Relational conjures up an image of a complex web of intelligence whereas, in reality, the database is a mere grid of simple vertical and horizontal relationships.

Read vertically, you get a list of customers. Read horizontally, you get the basic details of that customer. Any relational intelligence, such as which customers live in London, lies in the applications targeting the database.

Williams has a different view on data storage that is far more "relational" than the relational database but, since the optimum word has already been misappropriated, he has called his concept the associative model.

The associative model for data is based on a descriptive process. For example, to say that someone is your uncle merely describes a relationship between you and them. The listener would have to enquire more deeply to find out whether the uncle was your father's or your mother's brother - or if the person was just a long-established family friend who had adopted the title. To more fully describe this person requires several statements:

  • He is called Alex

  • He is called James

  • He is called Norman

  • Alex is brother to James

  • James is father to Norman

  • Alex is uncle to Norman

    These associations are expressed in a simple subject-verb-object syntax. This basic structure explains why Williams has called Lazy Software's database Sentences. The only modification is to refer to the structure as source-verb-target.

    The other essential elements that spring from the simple sentence structure are the concepts of entities and associations.

    Alex, James and Norman are entities because they have a discrete, independent existence. Their relationships are associations because they depend on the entities they connect: if one of the entities dies, the association must change to the past tense. In other words, entities are persistent and associations are dependent.

    The difference between the associative and the relational database is the concept of association. In a relational database, all items are entities and any association is implied and imposed by the applications addressing the database.

    In the associative model, there are two tables. One is a simple list of a set of items (entities, verbs, objects) each given a unique identification number. The second is a four-column table of links. Each row or link has an identification number, source, verb and target.
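The two-table layout can be sketched in a few lines of Python. This is only an illustration of the idea as described here, not Lazy Software's implementation: the names `AssociativeStore`, `Link`, `add_item` and `add_link` are invented for the example, and the rule that a verb must always be an item is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Link:
    """One row of the links table; each field holds an identifier."""
    source: int
    verb: int
    target: int


class AssociativeStore:
    """Hypothetical in-memory stand-in for the two tables."""

    def __init__(self) -> None:
        self.items: Dict[int, str] = {}   # identifier -> name
        self.links: Dict[int, Link] = {}  # identifier -> link row
        self._next_id = 1                 # one counter, so ids never collide

    def _new_id(self) -> int:
        new_id = self._next_id
        self._next_id += 1
        return new_id

    def add_item(self, name: str) -> int:
        item_id = self._new_id()
        self.items[item_id] = name
        return item_id

    def add_link(self, source: int, verb: int, target: int) -> int:
        # A source or target may be an item or an earlier link;
        # the verb is assumed here to always be an item.
        for ref in (source, target):
            if ref not in self.items and ref not in self.links:
                raise KeyError(f"unknown identifier: {ref}")
        if verb not in self.items:
            raise KeyError(f"unknown verb: {verb}")
        link_id = self._new_id()
        self.links[link_id] = Link(source, verb, target)
        return link_id


# "James is father to Norman" stored as three items and one link.
store = AssociativeStore()
james = store.add_item("James")
father_to = store.add_item("is father to")
norman = store.add_item("Norman")
fact = store.add_link(james, father_to, norman)
```

Drawing item and link identifiers from a single counter is a design choice of this sketch; it lets a link's source point at either an item or another link without ambiguity.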

    To illustrate this, Williams gives the example of how a simple piece of information would be stored in Sentences.

    The information is "Flight XY1234 arrives at London Heathrow on 12 December 2000 at 10.25am". This would be translated as four things: Flight XY1234, London Heathrow, 12 Dec 2000, 10.25am. In addition there are three "verbs": arrives at, on, at. These are stored with identifiers:


    Identifier  Name
    77          Flight XY1234
    08          London Heathrow
    32          12 Dec 2000
    48          10.25am
    12          arrives at
    67          on
    09          at

    Using this data, one piece of information is stored in each links row: "Flight XY1234 (77) arrives at (12) London Heathrow (08)" becomes:

    Identifier  Source  Verb  Target
    74          77      12    08

    This can be used to build the next row, which adds "on (67) 12 December 2000 (32)":

    Identifier  Source  Verb  Target
    03          74      67    32

    Adding a third line for the "at 10.25am" statement, the full table in Sentences format becomes:

    Identifier  Source  Verb  Target
    74          77      12    08
    03          74      67    32
    64          03      09    48

    It is clear that the original sentence could easily be reconstructed recursively by a program using the line with identifier 64. This can apply to metadata just as much as to data, so complex transaction processes can be handled using multiple entity lists. Although the identifiers given above are numerical, a list of books for sale could be drawn up using Book1, Book2 and so on for the identifier. "This makes it easier for the programmer," says Williams. "SQL is a difficult language to use but a reasonably knowledgeable end-user could write applications in Sentences."
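The recursive reconstruction can be sketched as follows, using the identifiers from the flight example. The dictionaries and the `render` function are illustrative assumptions rather than anything Sentences exposes. Identifiers are kept as strings so that leading zeros such as "08" survive, and the verbs are written in lower case as they read in the sentence.

```python
# Items table from the flight example: identifier -> name.
ITEMS = {
    "77": "Flight XY1234",
    "08": "London Heathrow",
    "32": "12 Dec 2000",
    "48": "10.25am",
    "12": "arrives at",
    "67": "on",
    "09": "at",
}

# Links table: identifier -> (source, verb, target).
LINKS = {
    "74": ("77", "12", "08"),
    "03": ("74", "67", "32"),
    "64": ("03", "09", "48"),
}


def render(identifier: str) -> str:
    """Turn an identifier back into text.

    An item is returned by name. A link is expanded into
    "source verb target", recursing when the source or target
    is itself a link.
    """
    if identifier in ITEMS:
        return ITEMS[identifier]
    source, verb, target = LINKS[identifier]
    return f"{render(source)} {render(verb)} {render(target)}"


sentence = render("64")
print(sentence)
# Flight XY1234 arrives at London Heathrow on 12 Dec 2000 at 10.25am
```

Starting from link 64 is enough because each link's source leads back to the link before it, until an item is reached.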

    Apart from describing a firm's business processes, Sentences has potential for describing multimedia objects. For example, an entity list could be created which would merely consist of objects and adjectives. A picture would be described in terms of the identifier number for each entity giving a greater degree of detail. If someone was searching for red sports cars, they would not only be given pictures in which the car was a main subject but could also find examples where a similar car just happened to be passing when a picture was taken.

    In ways like this, the database can be very economical in its use of storage space because of the way it splits entities from relationships. For example, a single entity may be a customer in one transaction but a supplier in another. In the typical relational database this would require two entries: one in the "customer" column and another in the "supplier". With Sentences the role of the entity depends on the context given by the links table so only one entry is necessary.
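A small sketch of that point, under invented data: the company name, the two orders and the verbs "is customer in" and "is supplier in" are all hypothetical, as is the `roles_of` helper. The entity is stored once, and its roles are read off the links that mention it.

```python
# Items table: the company appears exactly once (all names hypothetical).
ITEMS = {
    1: "Acme Ltd",
    2: "is customer in",
    3: "is supplier in",
    4: "Order A",
    5: "Order B",
}

# Links table: identifier -> (source, verb, target).
LINKS = {
    10: (1, 2, 4),  # Acme Ltd is customer in Order A
    11: (1, 3, 5),  # Acme Ltd is supplier in Order B
}


def roles_of(entity_id: int) -> list:
    """Return the sorted verbs of every link whose source is the entity."""
    return sorted(
        ITEMS[verb]
        for source, verb, _target in LINKS.values()
        if source == entity_id
    )


acme_roles = roles_of(1)
print(acme_roles)
# ['is customer in', 'is supplier in']
```

Adding a third role would mean adding another link row, not another copy of the entity.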

    Sentences is written in Java and runs as a servlet on a Web server and as a Java applet in a browser at the client end. This means that it is potentially portable across any platform that can run a Java Virtual Machine, but at the moment the company only supports Windows and Linux.

    The browser basis is important to Lazy Software's strategy because it is effectively in competition with Oracle, IBM and Microsoft - as if it did not have enough mountains to climb in just convincing everyone of Sentences' professed benefits.

    Williams realises that he must steer clear of a head-on clash and try to create a niche market as a bridgehead. "We are positioning the database as an ideal format for Web sites. It can happily co-exist with relational databases but has the advantage of an ease of programming that avoids SQL and offers a greater flexibility and reusability when new applications need to be written," he says.

    This is a crucial point. Relational schemas often become unwieldy and complex, partly because of the way the technology has developed to handle new requirements such as object storage, but mainly because they have been built up and extended over a long time. Even experienced programmers feel the frustration of having to develop new SQL applications from the ground up, says Williams.

    As a start-up, Lazy Software is not lacking in expertise. Williams and the other two founders, Simon Haigh and Melinda Horton, were the driving force behind Synon, which produced a development environment for the AS/400 and the Obsydian application development tool.

    In a technology audit for the Butler Group, analyst Michael Thompson writes, "Lazy Software has the unenviable task of bringing to market a completely revolutionary idea. If that was not enough of a challenge, it is also doing this with a product that can fit into several markets - something that can often blur the marketing message."

    The company name is derived from the saying, "If you want a good mathematician choose a lazy one. Not one who's too lazy to do anything but one who's just lazy enough to find a simpler way of doing things."

    The beauty of the Associative Model is its simplicity but therein could be the seeds of its own destruction. Thompson says, "While Lazy Software is taking Sentences to market at the enterprise level, the major factor that will need to be overcome is the expected resistance that something so 'simple' cannot create business-critical applications."