Writing the future

Brian Clegg assesses the state of handwriting recognition technology

Getting information into a computer is a laborious business. Although the penetration of PCs into schools and homes has produced a new generation more comfortable with keyboards, there are many in business, particularly at senior management levels, who hate using them.

Probably the most widely touted alternative is speech recognition technology, but this is limited. Imagine an open-plan office with all the inhabitants attempting to speak to their PCs. In addition, dictation is rarely a natural process, and speech, even more so than using a keyboard, is a clumsy way of editing text. The obvious solution is handwriting. It is quick, quiet and ideal for annotation and correction.

In practice, getting a computer to recognise handwriting has proved difficult. But the big guns are working on it. Last summer, Bill Gates said that high-quality handwriting recognition would be part of the mainstream before speech recognition. He predicted that it would hit the market in "two or three years' time". This message was reinforced at last year's Comdex Fall, where Gates introduced the world to Microsoft's Tablet PC.

While Gates' predictions should always be taken with a pinch of salt (this is the man who said the Internet would not amount to much), he is clearly taking handwriting recognition seriously.

Microsoft's rivals do not intend to be left behind, either. Apple announced that it would ship handwriting recognition alongside this year's version of its operating system, OS X, although the Inkwell product disappeared shortly before OS X was released.

Apple has had an on-off attitude to handwriting recognition, and the company was largely responsible for the decline of the technology during the 1990s.

Early handwriting products, dating back 10 years or more, were little more than prototypes rushed into production, but Apple, under the guidance of John Sculley, went much further. Its Newton Messagepad promised usable recognition of normal handwriting. This was a great aim, but the technology let down the promise.

The Newton's Russian-written software was notorious for guessing the wrong word, a fact that was emphasised when a series of Doonesbury cartoons was created around the bizarre messages that it produced.

Newton's handwriting recognition software was also slow - in the early versions it could be a couple of sentences behind the writer. This made it difficult to keep an eye on what was being rendered.

The quality of recognition did improve with later versions, but the damage was done. A third-party developer brought out an alternative product, Graffiti, and while the Newton version never caught on, Graffiti was to become the de facto standard of the late 1990s.

The main reason for Graffiti's success was that it was bundled with millions of Palm Pilot PDAs. It uses an unnatural approach, but overcomes many technical problems. Text is input one character at a time, each character written in the same, single box, while the text that has been interpreted appears in a separate application.

To keep recognition simple, each letter is given a unique shape that is formed in a single stroke without removing the stylus from the screen. This results in some strange formations - A, for example, is an upside-down V, while K has the diagonal strokes joined in a loop and no downstroke. As is common with recognition systems, Graffiti also has special characters to deal with common requirements such as a tab or a new line.
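
The single-stroke idea can be illustrated with a minimal sketch. It assumes a stroke arrives as a list of (x, y) points with y increasing upwards, reduces it to a sequence of compass directions and looks that sequence up in a table. The `TEMPLATES` table and the function names are invented for illustration and are not Graffiti's actual shapes or algorithm.

```python
import math

# Hypothetical templates: direction sequences invented for this sketch,
# not Graffiti's real character definitions.
TEMPLATES = {
    ("NE", "SE"): "A",  # the upside-down V described above
    ("S", "E"): "L",
    ("S",): "I",
}

DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def direction(p, q):
    """Quantise the segment p -> q to one of eight compass directions."""
    angle = math.atan2(q[1] - p[1], q[0] - p[0])
    return DIRECTIONS[round(angle / (math.pi / 4)) % 8]


def stroke_code(points):
    """Reduce a stroke to its sequence of directions, merging repeats."""
    code = []
    for p, q in zip(points, points[1:]):
        if p == q:
            continue  # ignore samples where the stylus did not move
        d = direction(p, q)
        if not code or code[-1] != d:
            code.append(d)
    return tuple(code)


def recognise(points):
    """Return the matching character, or '?' if the stroke is unknown."""
    return TEMPLATES.get(stroke_code(points), "?")


print(recognise([(0, 0), (1, 2), (2, 4), (3, 2), (4, 0)]))
```

Because every character is a single stroke with its own direction sequence, the recogniser never has to decide where one letter ends and the next begins, which is the problem that makes ordinary handwriting so much harder.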

There can be no doubt that Graffiti (and similar products for Windows CE such as Jot) is a fudge. It is designed to suit the system, not the user. While it is not bad on the small screens of palmtop devices, the inability to write directly into an application and the need to enter single special characters is restrictive.

The future of handwriting recognition has to be the capability to handle ordinary, or cursive, handwriting. But this is a much more complex process than recognising specially defined characters.

Cursive recognition typically starts by cleaning up the input - removing content that doesn't contribute information. This includes correcting slants, making sure that words are appropriately aligned and removing the ascender and descender strokes that make one letter interfere with an adjacent one. Words can then be distinguished by breaking the text down into characters.
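
As one example of this clean-up stage, slant correction can be sketched as a shear transform. The version below assumes pen input as (x, y) points with y increasing upwards, and estimates the slant crudely from the steepest segments. The function names are illustrative, and a production recogniser would be expected to use a more robust estimator.

```python
def estimate_slant(points):
    """Average horizontal drift per unit of height over the steep segments."""
    ratios = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if abs(dy) > abs(dx):  # only near-vertical segments carry slant
            ratios.append(dx / dy)
    return sum(ratios) / len(ratios) if ratios else 0.0


def deslant(points):
    """Shear the stroke so that its near-vertical segments stand upright."""
    slant = estimate_slant(points)
    return [(x - slant * y, y) for x, y in points]


# A right-leaning downstroke followed by a horizontal bar.
print(deslant([(0, 0), (1, 2), (2, 4), (4, 4)]))
```

The shear moves each point sideways in proportion to its height, so leaning strokes become vertical while horizontal distances along any one line of writing are preserved.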

This process is often helped by using a dictionary to isolate legal letter combinations. But recognition rarely involves straightforward matching of handwritten characters to letter shapes - instead, preprocessing transforms the characters into a superset of symbols with attached properties (for example curvatures and line lengths), which are then used as the basis for matching.
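
The role of the dictionary can be shown with a small sketch. It assumes an earlier stage has already produced, for each character position, a set of candidate letters with confidence scores. The scores, the word list and the `best_word` function are invented for illustration.

```python
def best_word(candidates, dictionary):
    """Pick the dictionary word that best fits the per-position candidates.

    candidates: one dict per character position, mapping letter -> score.
    Returns None if no dictionary word is consistent with the candidates.
    """
    best, best_score = None, 0.0
    for word in dictionary:
        if len(word) != len(candidates):
            continue
        score = 1.0
        for letter, options in zip(word, candidates):
            score *= options.get(letter, 0.0)  # an unseen letter rules the word out
        if score > best_score:
            best, best_score = word, score
    return best


# Taken letter by letter, the top-scoring reading is "eal", which is not a word.
candidates = [
    {"c": 0.4, "e": 0.6},
    {"a": 0.7, "o": 0.3},
    {"t": 0.3, "l": 0.7},
]
print(best_word(candidates, ["cat", "cot", "eat", "dog"]))
```

Here the dictionary overrides the most likely reading of the final letter, because only legal letter combinations are allowed to compete.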

The level of intelligence and sheer processing power needed to interpret cursive handwriting has always been underestimated. It may be that Microsoft's Tablet has got it right - but don't hold your breath.

Security applications
Handwriting recognition by the human eye is the basis for the most commonly used security identifier - the signature - appended to millions of cheques, card receipts and documents each day. But using handwriting recognition as an electronic security measure requires more than visual pattern recognition.

Anyone can learn the visual aspects of someone else's signature well enough to fool the casual eye or scanner - but security recognisers take note of the way we accelerate and decelerate, or apply differing amounts of pressure while signing. No forger can duplicate all of these aspects.
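
A minimal sketch shows why dynamics are harder to forge than shape. It assumes each signature is captured as (time, x, y, pressure) samples, and that the enrolled and attempted signatures have the same number of samples. The tolerances and function names are invented for illustration, and a real verifier would need to align signatures of different lengths.

```python
import math


def speed_profile(samples):
    """Pen speed between consecutive (time, x, y, pressure) samples."""
    return [
        math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        for (t0, x0, y0, _), (t1, x1, y1, _) in zip(samples, samples[1:])
    ]


def mean_abs_diff(a, b):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def matches(enrolled, attempt, speed_tol=0.25, pressure_tol=0.1):
    """Accept only if both the speed and the pressure profiles are close."""
    if len(enrolled) != len(attempt) or len(enrolled) < 2:
        return False
    speed_gap = mean_abs_diff(speed_profile(enrolled), speed_profile(attempt))
    pressure_gap = mean_abs_diff(
        [s[3] for s in enrolled], [s[3] for s in attempt]
    )
    return speed_gap <= speed_tol and pressure_gap <= pressure_tol


enrolled = [(0, 0.0, 0, 0.2), (1, 1.0, 0, 0.5), (2, 3.0, 0, 0.8)]
genuine = [(0, 0.0, 0, 0.25), (1, 1.1, 0, 0.5), (2, 3.0, 0, 0.75)]
forgery = [(0, 0.0, 0, 0.5), (1, 1.5, 0, 0.5), (2, 3.0, 0, 0.5)]

print(matches(enrolled, genuine), matches(enrolled, forgery))
```

The forgery traces exactly the same line on the page as the enrolled signature, but it is drawn at an even speed and pressure, so it fails on both counts.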

It was thought for some time that this would become the standard for electronic security, as the signature is already an accepted form, has none of the criminal associations of the fingerprint, and lacks the threatening nature of retinal scans. Unfortunately, the devices to handle signature recognition have proved expensive and prone to damage. As trials have shown that the public does not mind using fingerprint identification, this is now more likely to become a standard.

Taking the Tablet: Microsoft's vision
Bill Gates' keynote speech at Comdex Fall focused on the fact that computers need to be easier to use. The highlight of his presentation was a demonstration of the Tablet PC, a prototype handwriting recognition-based device that is scheduled to go into production in 2002.

Microsoft does not have much history of selling hardware, but perhaps this PC built around an input device, a 4cm-thick, clipboard-sized, write-on screen, will provide the opening that the software giant desires.

The Tablet features orange and silver styling that is a hybrid of Apple's iBook and Sony's elegant Vaio. It was demonstrated at Comdex running Whistler, an alpha version of Windows XP. It has a 10Gbyte hard disc, 128Mbytes of memory and supports wireless networking, but the significant advance is "digital ink", Microsoft's handwriting software.

Unlike other products, the Tablet can manipulate handwritten text while keeping it in a handwritten form. While it is possible to convert to the typed form, you can also select, cut and paste, make bold and italic, and search text that remains handwritten.
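
One way to picture handwriting as manipulable data is sketched below. This is an illustrative assumption rather than a description of Microsoft's software, whose internals have not been described: each handwritten word keeps its original strokes, with the recogniser's best guess stored alongside purely for searching. The `InkWord` class and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InkWord:
    strokes: list       # the original pen strokes, each a list of (x, y) points
    guess: str          # the recogniser's reading, used only for searching
    bold: bool = False  # formatting applies to the ink, not to typed text


def find(ink, query):
    """Return the positions of handwritten words whose reading matches."""
    return [i for i, word in enumerate(ink) if word.guess.lower() == query.lower()]


def cut(ink, start, end):
    """Remove words start..end-1 from the page and return them as a clipping."""
    clipping = ink[start:end]
    del ink[start:end]
    return clipping


page = [
    InkWord([[(0, 0), (1, 1)]], "meet"),
    InkWord([[(2, 0), (3, 1)]], "Brian"),
    InkWord([[(4, 0), (5, 1)]], "today"),
]
page[1].bold = True
print(find(page, "brian"))
clipping = cut(page, 1, 2)
print(len(page), clipping[0].strokes)
```

The strokes are never replaced by typed characters, so a poor guess by the recogniser affects only searching, not what appears on the screen.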

The make or break factor for the Tablet will be the quality of the recognition, but this ability to treat handwritten text as manipulable data is a rare piece of innovation from Microsoft.
