Feature

Hidden complications of content management

Content management may seem like a simple concept, but there are hidden complications. Danny Bradbury reports

Many might argue that the Internet is the biggest step in the dissemination of content since the invention of the Gutenberg printing press. The truth is that the Internet is to content what DTP was to the church newsletter. Just as vicars went crazy designing ridiculous layouts for misspelled text and badly written articles, so companies put out content on their Internet sites without much thought for its quality.

A recent survey conducted by NOP and sponsored by content management software vendor Mediasurface, for example, asked 104 companies whether they thought information was duplicated across their Web sites. Some 41% said that it was, even though a high percentage of companies regarded their Internet sites as important marketing tools, and 52% placed the greatest importance on the quality of the content on the site.

Meanwhile, 47% of the respondents to the survey revealed that they had no content management or personalisation software in place, and of those that didn't, 75% said that they would like to publish content to the site directly.

Content management tools can be used to ensure that text is created and updated in a structured way, to check Web sites for bad links, and to ensure that all of your graphics, audio and video clips are stored properly for later retrieval. Beyond ensuring the structure and quality of text on a site, their biggest goals are to ensure that content can be exchanged between different parties, found easily by visitors to a Web site, and made as relevant as possible to those visitors. Consequently, many content management systems also deal with content personalisation. In spite of the proliferation of multimedia content on the Web, text is still the most popular form, and it is also one of the most difficult to manage properly. Just as in the publishing industry, text can be misspelled, badly constructed, or just plain inaccurate.
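As an illustration of the link checking mentioned above, here is a minimal Python sketch. It is not taken from any product named in this article: the `LinkCollector` class, the `broken_internal_links` function and the sample site are hypothetical, and it checks only internal links against a known set of pages rather than fetching anything over the network.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def broken_internal_links(pages):
    """pages maps a site path to its HTML; returns (page, missing target) pairs."""
    broken = []
    for path, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            # Only site-relative links are checked in this sketch.
            if link.startswith("/") and link not in pages:
                broken.append((path, link))
    return broken


# Hypothetical two-page site: /news.html is linked to but does not exist.
site = {
    "/index.html": '<a href="/products.html">Products</a> <a href="/news.html">News</a>',
    "/products.html": '<a href="/index.html">Home</a>',
}
problems = broken_internal_links(site)
```

Here `problems` lists the page that contains the bad link alongside the missing target, which is the information an editor would need to fix it.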

Problems can emerge when text is created by a multitude of people from different departments, all of whom will have different writing styles and levels of literacy. Building editing processes traditionally associated with the publishing sector can help alleviate this problem, says Nick Gregory, vice president of marketing at content management software firm Mediasurface. "Digital ink never dries. It has to be constantly rotated. So you have to think about a lot of authors for content," he explains.

Many of the better content management systems will make this process non-technical, building workflow processes into the system so that business managers can enter information and then have it sent to an editor who collates and checks the information, for example. Such systems can then be used to assign rights to specific people. You may not want to let an author change content once it has been posted on the site, for instance.
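The workflow and rights described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: the role names, the three states and the `Article` class are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical workflow: which role may move an item from one state to the next.
TRANSITIONS = {
    ("draft", "submitted"): "author",
    ("submitted", "published"): "editor",
    ("submitted", "draft"): "editor",  # editor sends it back for rework
}


@dataclass
class Article:
    title: str
    body: str
    state: str = "draft"
    history: list = field(default_factory=list)

    def move(self, new_state: str, role: str) -> None:
        """Change state only if this role is allowed to make the transition."""
        allowed_role = TRANSITIONS.get((self.state, new_state))
        if allowed_role != role:
            raise PermissionError(f"{role} may not move {self.state} -> {new_state}")
        self.history.append((self.state, new_state, role))
        self.state = new_state

    def edit(self, new_body: str, role: str) -> None:
        """Authors may edit drafts only; nobody edits a published article here."""
        if self.state == "published" or (role == "author" and self.state != "draft"):
            raise PermissionError(f"{role} may not edit a {self.state} article")
        self.body = new_body


article = Article("Price list", "Widgets cost 5 pounds")
article.move("submitted", "author")   # the manager enters the information
article.move("published", "editor")   # the editor checks and posts it
```

Once the article is published, a call such as `article.edit("new text", "author")` raises `PermissionError`, which matches the case of an author who should not change content after it has been posted.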

Another step in the content management chain that many people ignore is localisation. English may be one of the more popular languages in the world, but there are plenty of potential customers speaking other tongues, and any good strategy for managing content on a global medium will enable you to reach them. Gregory works with translation companies that take content from one repository and translate the information for clients. Although machine translation (the automatic translation of text using computer software) is faster than manual translation, it can be overly literal, producing the sort of text that you see in manuals for video recorders translated directly from Japanese, for example. Manual evaluation of translated content is vital, he insists.

Of course, not all of the information that you publish on your Web site or intranet need be your own. Syndication is big business, and companies such as Newsedge, a firm that has been aggregating content for 12 years, will sell it to you.

"It's important inside a big corporation or bank to reduce the amount of information to the most relevant stories available on a particular subject," says Jon McNerney, senior international vice president at Newsedge. The company takes roughly 100,000 news articles a day and puts them into a repository, where it can filter duplicates and then deliver customised news to corporate customers based on their predefined profiles.
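The filter-then-deliver idea can be sketched as follows. The article does not say how Newsedge detects duplicates or matches profiles, so everything here is an assumption for illustration: duplicates are caught with a simple hash of the normalised text, and a profile is just a list of topics.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash of the text with case and whitespace normalised, so trivial variants match."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def remove_duplicates(articles):
    """Keep the first copy of each article body, preserving order."""
    seen = set()
    unique = []
    for article in articles:
        key = fingerprint(article["body"])
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique


def select_for_profile(articles, profile_topics):
    """Return articles sharing at least one topic with the customer's profile."""
    wanted = set(profile_topics)
    return [a for a in articles if wanted & set(a["topics"])]


# Made-up feed: item 2 repeats item 1 with different case and spacing.
feed = [
    {"id": 1, "body": "Bank raises rates", "topics": ["banking"]},
    {"id": 2, "body": "bank  raises RATES", "topics": ["banking"]},
    {"id": 3, "body": "New phone launched", "topics": ["telecoms"]},
]
unique = remove_duplicates(feed)
for_bank = select_for_profile(unique, ["banking"])
```

An exact hash only catches copies that differ in case or spacing. Spotting the same story reworded by two different wire services would need fuzzier matching than this sketch attempts.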

But as the amount of content grows, finding what you need on a Web site can be difficult. One way around it is to design your site to be intuitive. In May last year, the World Wide Web Consortium (W3C) released the initial version of its Web Content Accessibility Guidelines recommendation. This specification created a clear set of expectations for the design of Web interfaces that would make the navigation of information simpler for users. Concepts in the specification include the use of style sheets to separate the look and feel of the content from the content itself, so that designers can define how content will appear on the page by changing a few central parameters, without having to make any changes to the actual content.

When there is a huge amount of content on a site, end-users can find it daunting. Search engines are another good way to keep your data accessible.

David Heath, European business manager for AltaVista Business Solutions, explains that his company is involved in selling the AltaVista search engine for corporate use. This gives people the ability to extract information from Internet and intranet sites, and also includes a developer kit so that companies can create interfaces from the search engine into their own Web applications.

This also raises the issue of dynamic content. Much information on Web sites is generated from databases, meaning that it isn't stored on static HTML pages within an organisation. It can be difficult to manipulate this content if it is only displayed in response to a query. As many people now access such data through Active Server Pages and JavaServer Pages scripts, manipulation commands can be written using these scripting languages.

Heath explains that the AltaVista search engine can also access this data. It can be set up to perform regular queries on data, say every 15 minutes, so that when an end-user performs a content search, up-to-date database content is also included in the results.
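The idea of re-querying a database at regular intervals so that search results stay current can be sketched like this. It is not AltaVista's interface: the `DatabaseIndex` class, the `products` table and its columns are hypothetical, and the clock is injectable only so that the example can simulate 15 minutes passing.

```python
import sqlite3
import time

REFRESH_SECONDS = 15 * 60  # the 15-minute interval mentioned above


class DatabaseIndex:
    """Keeps a searchable snapshot of database rows, rebuilt when it is stale."""

    def __init__(self, connection, clock=time.monotonic):
        self.connection = connection
        self.clock = clock
        self.snapshot = []
        self.built_at = None

    def refresh_if_stale(self):
        now = self.clock()
        if self.built_at is None or now - self.built_at >= REFRESH_SECONDS:
            # Hypothetical table and columns, for illustration only.
            rows = self.connection.execute(
                "SELECT name, description FROM products ORDER BY name"
            )
            self.snapshot = rows.fetchall()
            self.built_at = now

    def search(self, term):
        """Case-insensitive substring match against the current snapshot."""
        self.refresh_if_stale()
        term = term.lower()
        return [name for name, description in self.snapshot
                if term in name.lower() or term in description.lower()]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (name TEXT, description TEXT)")
db.execute("INSERT INTO products VALUES ('Kettle', 'Boils water')")

fake_time = [0.0]  # stands in for the real clock
index = DatabaseIndex(db, clock=lambda: fake_time[0])
first = index.search("kettle")

db.execute("INSERT INTO products VALUES ('Travel kettle', 'Small and light')")
before_refresh = index.search("kettle")  # snapshot still fresh, new row not seen
fake_time[0] += REFRESH_SECONDS
after_refresh = index.search("kettle")   # snapshot rebuilt, new row included
```

The trade-off is visible in `before_refresh`: between refreshes the search can miss rows added to the database, so the interval sets how stale results are allowed to be.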

Finally, the eXtensible Markup Language (XML) is a useful means of handling content. One thing that the language allows you to do is to define what the data in a document means, rather than simply how it should be displayed. Using languages written in XML, then, it is possible for developers to produce content that can be syndicated and searched more easily, because a search engine can know what relevance the data has by reading the tags associated with it. If, for example, end-users are searching news articles marked up using an XML-based language focusing on the news industry, they could specify that they want articles with the tags "politics" and "US". This will ensure that when they search for "George Bush" they don't get back articles about bush fires or bushels of wheat, for example.
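The difference between a plain keyword search and a tag-aware one can be shown with Python's standard XML library. The markup below is made up for illustration (real news vocabularies define their own elements), and the `search` function is a hypothetical sketch rather than any search engine's actual interface.

```python
import xml.etree.ElementTree as ET

# Made-up markup for illustration only.
NEWS_XML = """
<articles>
  <article>
    <headline>George Bush campaigns in Florida</headline>
    <subject>politics</subject>
    <region>US</region>
  </article>
  <article>
    <headline>Bush fires threaten farms</headline>
    <subject>environment</subject>
    <region>Australia</region>
  </article>
  <article>
    <headline>Bushels of wheat hit record price</headline>
    <subject>agriculture</subject>
    <region>US</region>
  </article>
</articles>
"""


def search(xml_text, keyword, subject, region):
    """Return headlines containing the keyword whose tags match subject and region."""
    root = ET.fromstring(xml_text)
    hits = []
    for article in root.findall("article"):
        headline = article.findtext("headline", default="")
        if (keyword.lower() in headline.lower()
                and article.findtext("subject") == subject
                and article.findtext("region") == region):
            hits.append(headline)
    return hits


# Keyword alone matches all three headlines, including fires and wheat.
keyword_only = [a.findtext("headline")
                for a in ET.fromstring(NEWS_XML).findall("article")
                if "bush" in a.findtext("headline").lower()]

# Keyword plus tags narrows the result to the political story.
tagged = search(NEWS_XML, "bush", "politics", "US")
```

Because the subject and region are stored as data rather than inferred from the wording, the filter does not depend on how the headline happens to be phrased.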

XML can also be used to ensure that your content is displayed in different ways depending on what sort of medium it is viewed on, such as a PC browser, PDA or WAP phone, and browsers can also be used to interpret instructions within XML code to display content in a particular way. An example is drawing a chart based on a set of numbers stored in an XML-related format.
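A rough sketch of publishing one piece of XML content to more than one kind of device follows. The element names, the two renderers and the device labels are all assumptions made for the example. In practice a style sheet language such as XSLT is often used for this kind of transformation instead of hand-written functions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical story markup: one source, several presentations.
STORY_XML = (
    "<story>"
    "<title>Rates rise</title>"
    "<summary>Bank lifts rates by 0.25%</summary>"
    "<body>The full report runs to several paragraphs.</body>"
    "</story>"
)


def render_html(story):
    """Full version for a desktop browser."""
    return (f"<h1>{escape(story.findtext('title'))}</h1>"
            f"<p>{escape(story.findtext('summary'))}</p>"
            f"<p>{escape(story.findtext('body'))}</p>")


def render_compact(story, width=40):
    """Title and summary only, truncated for a small screen."""
    line = f"{story.findtext('title')}: {story.findtext('summary')}"
    return line[:width]


# Map each device type to the function that formats content for it.
RENDERERS = {"pc": render_html, "phone": render_compact}


def render(xml_text, device):
    story = ET.fromstring(xml_text)
    return RENDERERS[device](story)


pc_page = render(STORY_XML, "pc")
phone_page = render(STORY_XML, "phone")
```

The content is written once. Supporting another device means adding a renderer to `RENDERERS`, with no change to the stored story.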

Managing content, then, isn't simply a case of ensuring that it's spelt correctly and slapping it up onto a Web page. Rather, it entails careful attention to localisation, accuracy, and searchability, along with some innovation in the area of content display, and the handling of dynamic information.

If you can invest some cash in the right tools and skills for the job, you will help to make your Web site a frequent point of return for visitors, hopefully increasing your sales into the bargain.

Content Management Players

  • Allaire produces Spectra, a content management system with workflow and process automation capabilities.

  • Mediasurface announced version 3.5 of its content management product, which includes a Java API so that it can be hooked into other applications.

  • Catalyst Solutions is selling Aptrix, a content management tool from Australian company Presence Online. Aptrix provides content personalisation facilities, along with 'write once, publish many' facilities enabling content to be accessed from different devices, like WAP phones.

  • Chrystal Software sells Eclipse, a content management system that provides layout and design capabilities, along with personalisation.

  • Arbortext sells Epic, a content management system that lets companies produce and manage personalised content, translating files from XML into formats including PDF, WML (for WAP phones) and Open Electronic Book (OEB) format for ebooks.

Top Tips for Content Management

  • Separate your content from your presentation data using style sheets

  • Use XML to describe your content in more detail

  • Set up a workflow process so that key personnel can edit content

  • Capitalise on your content strategy by creating an information portal focusing on your chosen subject, thereby building your site's reputation

  • Follow the W3C's content accessibility guidelines

  • Include some form of translation in your content strategy to extend your business to foreign language speakers

  • Make sure that you include dynamic content in your strategy - it's arguably your most valuable content asset

  • Make your content searchable so that users can find things more easily



    This was first published in December 2000

     
