This year, Microsoft is almost certain to add Speech Application Language Tags (Salt) to Visual Studio .net and ASP.net (Active Server Pages), which should make it much easier to implement voice response systems.
Last autumn, Microsoft, Intel and Cisco got together with Comverse, Philips and SpeechWorks to found the Salt Forum - you can download the 0.9 specification and a white paper from its Web site at www.saltforum.org/. The standard is royalty-free and platform-independent, but Microsoft also licensed SpeechWorks' technology.
The obvious application is a voice-enabled Web. The much more interesting idea is that Salt could do away with all the keys and buttons on mobile phones and handheld organisers.
Voice-activated dialling is just the start. Eventually, users will be able to pick up their phones and say things like, "Fetch Bill Smith's number from my home PC, then call it," and, "Ask Interflora to send my girlfriend a red rose."
The main driver for this sort of development could well be the motor industry. Darwinian selection will eliminate the kind of person who thinks it is a good idea to send text messages from a cellphone while doing 70mph down the M1, leaving behind only the people who use voice response systems that let them keep their hands on the wheel.
Another leap forward is already available. Some companies are using it now to delight their customers, and make a significant contribution to the advance of civilisation. They are installing voice response applications to replace touch-tone phone menu systems from hell (press 5 to go back to the last menu, press 6 to make a will, press 7 to send a death threat to the managing director). To be frank, I would like any corporate strategist who thinks touch-tone menus have a future to go play SMS roulette.
There are a couple of obvious caveats to Salt. One is that many people have spent the past year working on the Voice XML standard. Another is that the Salt Forum's founders unaccountably missed out the biggest voice response technology supplier, Nuance, and the third biggest, IBM.
However, it is far from clear whether Salt and Voice XML will compete against or complement one another. And as SpeechWorks' chief executive Stuart Patterson told me last week, the industry is still based on proprietary technologies, so it is better to have two standards than none.
The key to Salt is that it has been designed to be "multimodal". It has to work on the Web, on mobile phones, and with any kind of input device you can find, including keyboards and mice, both on embedded systems and over the network. That makes it an important part of everybody's future.
Jack Schofield is computer editor at the Guardian