A guide to developing taxonomies for effective data management

To make the search and browse capabilities of content, document or records management systems truly functional, we need to develop taxonomies.

The story goes that if Microsoft had made completion of the properties box mandatory for all Office documents, there would be no need for document management systems. But as the politicians say, "we are where we are", so we need to develop taxonomies (sets of chosen terms used to retrieve online content) to make the search and browse capabilities of content, document or records management systems truly functional.

Whether a taxonomy is designed for storage and management or to support better search, management systems of every type are near useless without one, regardless of the platform. Yet many organisations are unwilling to commit proper resources to taxonomy design, so millions of pounds are spent on management technologies without investment in the categorisation needed to organise the content they hold.

Organising information

Responsibility for the placement and tagging of organisational information has shifted from a small group of information professionals, such as the much-mourned corporate librarians, to a wider pool of content managers.

The driver for change is the retirement of a generation whose careers saw the end of the secretary/typist and whose mantra was 'information is power so let's keep it secret - even from my boss'. With generations X and Y have come people who accumulate online information willy-nilly. While this should lead to information being accessible and easy to share, the result is often mismanaged content with no tagging.

Business taxonomies have now gained credibility as the missing link in information management projects. The word comes from the Greek taxis, meaning 'order' or 'arrangement': a taxonomy uses taxonomic units to classify otherwise random objects and arrange them in a hierarchical structure. For example, a car is a subtype of vehicle, so a car is a vehicle but not every vehicle is a car.
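The car and vehicle example can be sketched in a few lines of Python. This is only an illustration: the `BROADER` table, its extra terms ('truck', 'hatchback') and the `is_a` helper are hypothetical names chosen for this sketch, not part of any particular product.

```python
# Hypothetical parent links: each term maps to its broader term (None marks the root).
BROADER = {
    "vehicle": None,
    "car": "vehicle",
    "truck": "vehicle",
    "hatchback": "car",
}

def is_a(term, ancestor):
    """Return True if `term` equals `ancestor` or sits anywhere beneath it."""
    current = term
    while current is not None:
        if current == ancestor:
            return True
        current = BROADER[current]  # step up one level in the hierarchy
    return False

print(is_a("car", "vehicle"))  # True: a car is a vehicle
print(is_a("vehicle", "car"))  # False: not every vehicle is a car
```

Storing only the link to the broader term keeps each entry simple, and the subtype relationship falls out of walking up the hierarchy.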

A business taxonomy should be the primary storage design for an enterprise's content. Organising content in the same way across systems supports their interoperability, and the benefit of having structured and unstructured information sources able to relate to common topics is considerable.

Increasing usability

By organising all corporate data in a single way, the overall usability of knowledge systems increases considerably. No longer will staff need to learn the methodology behind one system, only to find a different one for the next tool. Importantly, business users get to sift through information available to them to find what they need and avoid the duplication of effort that bedevils much of corporate life.

A business taxonomy forces system designers to map metadata fields to content categories, for example department, location, topic and document type. A list of permitted metadata values is then defined to populate each field in line with the taxonomy. This constraint is critical to making system designers adhere to a strategic vision rather than one of their own.
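As a rough sketch of what a defined list of values per field might look like in practice, the Python below checks a document's tags against a controlled vocabulary. The field names follow the examples in the text; the values themselves and the `validate_metadata` function are assumptions made for illustration.

```python
# Hypothetical controlled vocabulary: each metadata field has a fixed list of values.
CONTROLLED_VALUES = {
    "department": {"Finance", "HR", "Legal"},
    "location": {"London", "Manchester"},
    "document_type": {"Policy", "Report", "Contract"},
}

def validate_metadata(metadata):
    """Return a list of problems; an empty list means the tags conform."""
    problems = []
    for field, allowed in CONTROLLED_VALUES.items():
        if field not in metadata:
            problems.append(f"missing field: {field}")
        elif metadata[field] not in allowed:
            problems.append(f"invalid value for {field}: {metadata[field]!r}")
    for field in metadata:
        if field not in CONTROLLED_VALUES:
            problems.append(f"unknown field: {field}")
    return problems

print(validate_metadata({"department": "Sales", "location": "London"}))
```

A check of this kind, run when content is published, is one way to stop free-text tags creeping back in.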

Using a consistent taxonomy for content storage helps an enterprise understand the information it holds as well as that which is missing. By referencing like information within a single schema it will be able to use related information that was previously divided into separate areas of management - this is particularly important, though more complex, when working with multi-disciplinary teams.

Supporting business growth

A business taxonomy has the potential for an even greater impact on the effective retrieval of content, or discoverability by users. Users are divided into 'browsers' who like to click through a structure to find what they are after, or 'searchers' who prefer search terms. Taxonomies serve both, providing users with multiple routes to the same information.
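To illustrate how one taxonomy can serve both kinds of user, here is a minimal Python sketch. The category names, file names and the `browse` and `search` functions are all hypothetical; a real system would use its platform's own navigation and search engine.

```python
# Hypothetical taxonomy: category -> subcategories.
CHILDREN = {
    "Policies": ["HR policies", "IT policies"],
    "HR policies": [],
    "IT policies": [],
}

# Hypothetical content: document -> the category it is filed under.
DOCUMENTS = {
    "leave-policy.doc": "HR policies",
    "password-policy.doc": "IT policies",
}

def browse(category):
    """Browsing: list the subcategories and documents directly under a category."""
    docs = sorted(d for d, c in DOCUMENTS.items() if c == category)
    return CHILDREN.get(category, []), docs

def search(query):
    """Searching: match the query against category names, then return the documents filed there."""
    q = query.lower()
    matched = {c for c in CHILDREN if q in c.lower()}
    return sorted(d for d, c in DOCUMENTS.items() if c in matched)

print(browse("Policies"))     # a browser clicks down from the top level
print(browse("HR policies"))  # and reaches the document
print(search("hr"))           # a searcher reaches the same document by term
```

Both routes end at the same document because both rely on the same category assignments.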

If business taxonomies are considered at the outset of information management projects, a foundation can be set that will allow organisations to expand and evolve their designs. Benefits in storage and management, findability, and interoperability will grow over time. A disorganised system will be prone to stagnation, have limited user adoption and dissolve into chaos.

The price of implementing a simple business taxonomy pales in comparison to the cost of failure of a project lacking one. The potential benefits are substantial and returns immediate. No organisation can afford to overlook this critical piece of its information management plans.



Steps to successful taxonomy design 


Roles and responsibilities

  • Governance board - define strategy and the appropriate type of content
  • Taxonomy team - ensure the value of content placement and metadata (a minimum of six and maximum of 12 members)
  • Content managers - approve and edit content
  • Content owners - publish content and apply metadata

Understand your content

  • More content means more time to re-tag
  • Clean out old or obsolete content
  • Every item has one correct categorisation
  • Items may be organised into multiple categories
  • Minimise number of 'clicks'
  • Allow flexibility and redundancy
  • Strive for topical taxonomy

Before getting started, understand your:

  • Audience
  • Publishers
  • Platform
  • Content
  • Limitations

Getting started

  • Keep it broad, shallow, simple and elegant
  • Six to 12 top-level categories
  • Two or three levels deep
  • Focus mainly on the primary, top-level concepts
  • Be inspired by existing schemes, e.g. industry standards and local practices
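The breadth and depth guidelines above can be checked mechanically. The sketch below assumes a draft taxonomy held as nested Python dictionaries; the `depth` and `check_shape` functions and the sample categories are illustrative only.

```python
def depth(tree):
    """Number of levels in a nested dict of categories (an empty dict has depth 0)."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())

def check_shape(tree, min_top=6, max_top=12, max_depth=3):
    """Return warnings where the taxonomy strays from the broad-and-shallow guidelines."""
    warnings = []
    if not min_top <= len(tree) <= max_top:
        warnings.append(f"{len(tree)} top-level categories (aim for {min_top} to {max_top})")
    if depth(tree) > max_depth:
        warnings.append(f"{depth(tree)} levels deep (aim for at most {max_depth})")
    return warnings

# Hypothetical draft: too few top-level categories and too many levels.
draft = {
    "Finance": {"Budgets": {"Annual": {"Drafts": {}}}},
    "HR": {},
}
print(check_shape(draft))
```

The thresholds are parameters so that a taxonomy team can adjust them to local practice.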




