Semantic web technology could help firms make better use of their
data and get better search results, but don't expect to see it on
the public internet any time soon.
One of the biggest problems with the web and with knowledge
management tools is that information is dumb. The data contained in
websites and knowledge management systems does not know what it is.
This makes web searches very difficult, turning up hundreds of
thousands of results that are completely irrelevant or only
partially related to your subject matter.
The semantic web project, initiated by Tim Berners-Lee, the creator
of the world wide web, has been designed to make web and knowledge
management data more intelligent. It works by encoding metadata
into information that helps to describe not only that information,
but its relationships with other pieces of data. In this way, you
augment traditional, hyperlinked connections with a new type of
semantic link. You create an invisible matrix in which information
is connected by meaning.
Semantic web links can be powerful when used in a commercial
context. If you operate in a vertical sector such as food
production, you may have thousands of pages on your intranet
detailing different aspects of your processes and products.
Searching through them could be difficult, but if you have
semantically encoded them, you may find it easier. Suddenly, you
will be able to start with a particular ingredient and ask the
browser to find all foods that use more than 10 milligrams of that
ingredient, for example. Or you may start with a finished food item
and semantically browse the ingredients that constitute more than
5% of its overall make-up.
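
To make the food example concrete, here is a minimal sketch in Python of the two queries described above. It is not how any particular semantic browser works: the Contains record, the function names and all of the sample foods and amounts are invented for illustration.

```python
from typing import NamedTuple


class Contains(NamedTuple):
    """One statement: a food contains some amount of an ingredient."""
    food: str
    ingredient: str
    milligrams: float


# Invented sample data; the foods and amounts are purely illustrative.
STATEMENTS = [
    Contains("oat bar", "oats", 30000.0),
    Contains("oat bar", "salt", 12.0),
    Contains("tomato soup", "tomato", 150000.0),
    Contains("tomato soup", "salt", 8.0),
    Contains("cracker", "flour", 9000.0),
    Contains("cracker", "salt", 25.0),
]


def foods_using(ingredient, more_than_mg):
    """Start from an ingredient: foods using more than the given amount."""
    return sorted(
        s.food
        for s in STATEMENTS
        if s.ingredient == ingredient and s.milligrams > more_than_mg
    )


def major_ingredients(food, min_share=0.05):
    """Start from a food: ingredients above min_share of its total weight."""
    parts = [s for s in STATEMENTS if s.food == food]
    total = sum(s.milligrams for s in parts)
    return sorted(
        s.ingredient for s in parts if total and s.milligrams / total > min_share
    )


print(foods_using("salt", 10))       # foods with more than 10mg of salt
print(major_ingredients("cracker"))  # ingredients above 5% of a cracker
```

Both questions run against the same statements, so adding a new food means adding data rather than writing a new query.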
Although some of this work can be done in traditional relational
database management systems, such structures are rigid and not easy
to change, said Alfredo Morales, director of collaborative
healthcare at Boston-based medical software company Clinician
Support Technology. His software product, Baby CareLink, is a
knowledge base designed to advise and remind clinicians dealing
with premature births. It works by encoding information about each
child in a semantic format.
"Semantic technology lets us establish loosely coupled
relationships within the patient's information. Relational database
rules would have to be hard coded and they also require hard work
to maintain," he said. "Semantic technology lets the knowledge base
adapt as we learn more about what is important for each particular
baby."
Semantic information is encoded using a standard called the
Resource Description Framework (RDF), which is commonly written in
XML. RDF can encode relationships between particular pieces of
information.
For example, "John" could be described as "man" and linked to
"Mary" with the relationship "husband of". This sounds simple, but
the possible descriptions of different objects and their
relationships are limitless. Companies are getting around this by
developing vocabularies for particular subject areas. Called
ontologies, these vocabularies often focus on vertical markets
which have specific subjects and relationships. Another XML-based
language, the Web Ontology Language (OWL), is used to create these
ontologies.
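
The "John" and "Mary" example can be sketched as a set of subject-predicate-object statements, which is the shape of data that RDF describes. The Python below is a rough illustration only: the "ex:" prefix stands for a made-up namespace, and the small VOCABULARY set is a stand-in for a real ontology, not an actual RDF or ontology toolkit.

```python
# Each statement is a (subject, predicate, object) triple.
# "ex:" is a hypothetical namespace prefix, not a published vocabulary.
TRIPLES = {
    ("ex:John", "rdf:type", "ex:Man"),
    ("ex:John", "ex:husbandOf", "ex:Mary"),
}

# A toy stand-in for an ontology: the relationships this subject area allows.
VOCABULARY = {"rdf:type", "ex:husbandOf", "ex:daughterOf"}


def objects_of(subject, predicate, triples=TRIPLES):
    """Return everything linked to a subject by the given relationship."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def outside_vocabulary(triples, vocabulary=VOCABULARY):
    """Return statements whose relationship is not in the agreed vocabulary."""
    return {t for t in triples if t[1] not in vocabulary}


print(objects_of("ex:John", "ex:husbandOf"))
print(outside_vocabulary(TRIPLES | {("ex:John", "ex:likes", "ex:Tea")}))
```

Restricting statements to an agreed vocabulary is what keeps the otherwise limitless set of possible relationships manageable.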
Semantic encoding can be particularly useful in inference engines.
Encoding relationships between pieces of information enables you
to analyse that information for new relationships. In our example,
"Mary" may have the relationship "daughter of" with "Eric". Now,
although it has not been explicitly encoded, we could infer that
"Eric" has the relationship "father-in-law of" with "John". When
dealing with rich sets of complex data, such capabilities can be
very useful.
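
The father-in-law inference can be sketched as a single hand-written rule applied to a set of facts. This Python is a toy under stated assumptions, not the method of any commercial inference engine: the relationship names are invented, and the rule assumes a parent marked as a "man" is a father.

```python
# Explicitly encoded facts as (subject, relationship, object) statements.
# The relationship names are hypothetical, chosen to match the example.
FACTS = {
    ("John", "type", "man"),
    ("Eric", "type", "man"),
    ("John", "husbandOf", "Mary"),
    ("Mary", "daughterOf", "Eric"),
}


def infer_fathers_in_law(facts):
    """Rule: if H is husband of W, W is daughter of P and P is a man,
    then P is father-in-law of H. Returns only the newly inferred facts."""
    inferred = set()
    for husband, rel, wife in facts:
        if rel != "husbandOf":
            continue
        for child, rel2, parent in facts:
            is_her_parent = rel2 == "daughterOf" and child == wife
            if is_her_parent and (parent, "type", "man") in facts:
                inferred.add((parent, "fatherInLawOf", husband))
    return inferred - facts


print(infer_fathers_in_law(FACTS))
```

Nothing in FACTS mentions a father-in-law; the new statement appears only because the two encoded relationships share "Mary".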
Using such technologies within the corporate firewall is one thing,
but building a whole new web based on them is quite another. If we
could create a second generation web using semantic technology, the
benefits would be huge.
Companies such as Google, which has already made the best it can of
the web's unstructured content base, could make web searches much
more intelligent, returning results that would not only be more
relevant, but which could then be navigated by concept, rather than
by hyperlink. Imagine clicking on a piece of data and receiving a
list of web-based elements that it is related to, along with a
description of those relationships.
John Davies, manager of next generation web research at BT's
research division BT Exact, said we are a long way from creating a
semantic web. "Whether it will make the step to the external web,
the jury is out. It is unlikely that anyone will turn those five
billion pages into RDF any time soon," he said.
Another problem is that ontologies focus on specific areas, but the
web covers all areas of information. Consequently, we must bring
ontologies together. The Dublin Core Metadata Initiative has been
working since 1995 to develop an infrastructure to do just
that.
The semantic web is not likely to hit your browser any time soon,
but the semantic intranet just might. The underlying technology has
been on the agenda since the mid-to-late 1990s, but it is now
starting to move from theory into commercial products as companies
begin to release RDF-capable knowledge management systems and
inference engines. UK-based Inference Networks is one such firm,
and in the US, Amblit Technologies has a semantic browser, and
Intellidimension has an RDF data management system.
The key challenge lies not just in encoding your existing data with
RDF, but also in developing or finding an ontology that best suits
your business. Do so, and the rewards could be high as you begin to
discover all sorts of tacit information buried inside your
company's knowledge base.
DCMI http://dublincore.org/
W3C semantic web activity www.w3.org/2001/sw/
Semantic Web Special Interest Group http://business.semanticweb.org/
Semantic web community portal www.semanticweb.org/