Drop the dead cyberdonkey

She's the sedentary Lara Croft - real-time news, virtually, from a virtual person. Danny Bradbury reports

PA Newswire put an artificial person online last week, with the launch of Ananova, a computer-generated newsreader designed to deliver personal bulletins over the Web.

Computer-generated cartoon characters are not a new thing - visitors to online chat rooms have been using symbolic figures called avatars for the past few years. But this is one of the first times that such a character has been designed with a voice and facial expressions of its own.

The virtual newsreader, who made her debut on 19 April, reads news articles created by journalists, but also relies on a real-time news search engine that provides the basis for the text-based news that PA subsidiary Ananova currently provides to portal customers.

How Ananova speaks

The service brings to mind the old Max Headroom TV series of the mid-1980s. It works using a combination of online news feed processing, text-to-speech technology from computerised speech and voice recognition specialist Lernout & Hauspie, and a digitally animated character created by specialist multimedia house Digital Animations.

News articles are coded into an XML-based format that includes tags specifically designed to influence the tone of the character's voice, and her facial expression.

Jonathan Jowitt, project manager for the Ananova character, explains, "We had to develop a technique where traditional text wouldn't sound flat and boring. We had to take that initial level of conversation and animate it, putting in facial and body movement. Reporters enter data into the XML template."
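To make the idea of expression-bearing markup concrete, here is a minimal sketch of how such a script might be read. The article does not publish Ananova's actual schema, so the element and attribute names (story, segment, tone, expression) and the sample text are invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element and attribute names are assumptions,
# not Ananova's real XML format, which the article does not describe.
SCRIPT = """
<story headline="Road closed after flooding">
  <segment tone="serious" expression="concerned">Heavy rain has closed the main road.</segment>
  <segment tone="upbeat" expression="smile">It should reopen by the weekend.</segment>
</story>
"""


def parse_script(xml_text):
    """Return a list of (text, tone, expression) tuples, one per segment."""
    root = ET.fromstring(xml_text.strip())
    cues = []
    for seg in root.iter("segment"):
        text = (seg.text or "").strip()
        # Fall back to a neutral delivery when a reporter leaves a tag out.
        tone = seg.get("tone", "neutral")
        expression = seg.get("expression", "neutral")
        cues.append((text, tone, expression))
    return cues


cues = parse_script(SCRIPT)
for text, tone, expression in cues:
    print(f"[{tone}/{expression}] {text}")
```

Each tuple pairs a stretch of text with the delivery a reporter asked for, which is the kind of information a speech engine and an animation engine would both need downstream.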

The XML-based script is passed to the Realspeak text-to-speech engine that Lernout & Hauspie configured for the company. This produces the initial pass of the sound file that will be used as the character's voice. The output also contains timing information, which dictates how the phonemes (the small units of speech that make up a statement) will be timed to sound natural.

Ananova uses the American English version of the text-to-speech engine, which also determines the prosody (melody) of the sentence, so that the intonation sounds human. The sound is created by choosing phonemes from a large database of sounds, built from a large number of recordings of a real person.

Patrick Salenbien, account manager at Lernout & Hauspie, explains that the software produces a set of markers describing which phonemes it will use, enabling third-party software developers to synchronise their own programs with the speech output. This output, accessed through an API, enables Digital Animations to synchronise the face of the character with the spoken words.
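The synchronisation described here can be sketched roughly as follows. This is not the Realspeak API, which the article does not detail: the PhonemeMarker structure, the phoneme symbols and the mouth-shape table are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PhonemeMarker:
    """Hypothetical marker: which phoneme is spoken, and when."""
    phoneme: str
    start_ms: int
    duration_ms: int


# Illustrative phoneme-to-mouth-shape table (an assumption, not L&H data).
MOUTH_SHAPES = {
    "m": "closed", "b": "closed", "p": "closed",
    "a": "open",
    "o": "round", "u": "round",
    "f": "lip-teeth", "v": "lip-teeth",
}


def keyframes(markers, default="rest"):
    """Turn timed phoneme markers into (start_ms, mouth_shape) keyframes."""
    frames = []
    for marker in markers:
        shape = MOUTH_SHAPES.get(marker.phoneme, default)
        # Skip a keyframe when the mouth is already in this shape.
        if frames and frames[-1][1] == shape:
            continue
        frames.append((marker.start_ms, shape))
    return frames


markers = [
    PhonemeMarker("m", 0, 80),
    PhonemeMarker("a", 80, 120),
    PhonemeMarker("p", 200, 60),
]
frames = keyframes(markers)
print(frames)
```

Because each keyframe carries the start time taken from the speech markers, an animation engine working from this list would change the face at the same moment the matching sound is heard.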

Facial expressions

Mike Hambly, chief executive of Digital Animations, says the company created the graphical character to be scalable in its complexity, so that the movement of the face could be made more or less realistic depending on the amount of processing power available.

Many computer-animated models in the past have worked by simply detaching the jaw and moving it independently of the rest of the face. Digital Animations engineered the model with a virtual set of 'muscles' that are animated by the output from the L&H Realspeak engine, so all parts of the face move accordingly, he says.

Jowitt says the system works by processing the speech and the image synchronisation at the back end, and then rendering it in a streaming protocol. It will initially be delivered in Realplayer format, but others may become available later.

At present, Jowitt estimates that the hardware takes about three times as long to process a formatted news article into speech and animation as the character takes to speak the story, so there is still a slight lag between the article being entered into the system and its announcement. "Within a very short time we will be able to speed up so that it only takes as long as it would to read the story," he adds. This will turn it into a truly real-time service.
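The arithmetic behind that estimate is simple enough to sketch. The 40-second bulletin length below is a made-up figure for illustration; only the three-to-one ratio and the one-to-one target come from Jowitt's remarks.

```python
def processing_seconds(spoken_seconds, ratio=3.0):
    """Estimated processing time for a story of the given spoken length.

    ratio is processing time divided by speaking time: about 3 at present,
    with 1 as the stated target.
    """
    return spoken_seconds * ratio


spoken = 40.0  # hypothetical bulletin length in seconds (an assumption)
current = processing_seconds(spoken)            # at today's 3:1 ratio
target = processing_seconds(spoken, ratio=1.0)  # at the 1:1 target

print(f"Current processing time: {current:.0f}s, target: {target:.0f}s")
```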

Where next

The company is providing localised servers around the world to maintain the quality of the multimedia signal by avoiding network congestion.

Concentrating on server processing reduces the processing power and memory footprint necessary to run the service on a client device. This strategy is designed to keep the company's options open. It wants to license the character so that it can make different announcements in specific settings. It anticipates that once the technology develops, mobile phone operators will make the character accessible to cellular users. This could make online mobile phone services much more consumer-friendly.

"She could also pop up in your car to say that the road you are driving down is blocked, and to ask if you would like to be told a different route," says Jowitt. On its Web site, Ananova speculates about radio alarm clocks that are able to display the character. She could wake people up early if news arose that might affect their day, such as congested traffic. Any licence fees produced by the character will generate a cut for Digital Animations.

The Press Association is selling off the Ananova subsidiary, claiming that its focus on consumer-oriented markets is incompatible with the parent firm's business-to-business focus. PA launched its business-to-business Web site on the same day as the Ananova announcement, and you can find it at www.pressassociation.press.net. Although it remains cagey about the status of the sale, sources indicate that a number of buyers have been shortlisted. Ananova made just over £4m last year from its conventional Web news feeds.

The face that launched a thousand chips

  • XML - used to provide Ananova with information about the news stories she is reading so that she can adapt the tone of her voice and facial expressions appropriately

  • Text-to-speech - Lernout & Hauspie's Realspeak text-to-speech engine uses an XML script to time Ananova's speech patterns so that they sound more natural

  • Animation - programming interfaces within Realspeak are used by the bespoke animation engine to model Ananova's facial expressions based on the words she is speaking

What Ananova will offer

  • News bulletins updated every five minutes

  • Personalised news updates based on your own profile

  • Updates on transactions (for example, reporting when spare theatre tickets become available), and automatic fulfilment of those transactions

  • Web searching to find information about subjects that you specify

  • Sports information including statistics, scores and gossip

  • Updates on shopping opportunities

  • Information on UK-based entertainment events such as cinema, comedy and music, based on a personal profile. This will also include TV listing information

  • Human-interest stories
