Microsoft is promising speech recognition for the masses with the launch of its Speech Server products at the SpeechTek conference in San Francisco next week.
The launch marks Microsoft's entry into the server-based speech recognition market, where it will compete with the likes of Nuance Communications, ScanSoft and IBM.
"Our goal is to make speech recognition technologies mainstream," said James Mastan, director of marketing for Microsoft's Speech Server group.
The pitch is simple. Developers can add speech capabilities to existing web applications based on Microsoft's ASP application framework by adding code based on XML and Speech Application Language Tags (Salt) technologies using Visual Studio .net.
Speech Server takes calls and communicates with the web server through XML and Salt, making applications offered online available over the phone.
Speech Server runs on Windows Server 2003. The Enterprise Edition needs to run on a separate physical server while Standard Edition, designed for small and medium-sized installations, can be placed on the same hardware as the web server. Microsoft will recommend configurations and resellers will offer fully configured systems.
Mastan believes users will like Speech Server because it is familiar: developers can use Visual Studio, and it runs just like any other Microsoft server product.
"It is not some black box in a call centre that you have to programme for in some weird language and you can't maintain yourself because you don't know how it works," he said.
Yankee Group and Gartner analysts said Microsoft has to prove itself in the market, and that users need to be aware that creating a speech recognition system is more complex than Microsoft's marketing messages make it sound.
"Speech applications and a voice user interface are pretty tricky to do. That may well get lost in the first version of the Microsoft marketing hype that will go out there," said Steve Cramoysan, a principal analyst at Gartner.
"If you're going to use Microsoft Speech Server, use professional services people who know exactly what they're doing."
Yankee Group senior analyst Art Schoeller issued the same warning to potential Speech Server users in a research note last year.
"It is dangerous to imply that any web developer will speech-enable applications, because not all have proper training in the best practices for dialogue design," he wrote.
Still, Microsoft's entry into the speech recognition market is a significant event, Cramoysan admitted. "Microsoft will certainly shake up this market, but I think we're going to be looking at the second and third version of this product when they will become much more competitive than with this first release of the product."
Nuance, named by Mastan as Microsoft's chief rival, agrees with the analysts and goes a step further. "Microsoft is developing an inexpensive and easy way for developers to design really bad applications," said Kevin Chatow, principal product manager at Nuance. Adding speech to web applications may not result in usable applications, he continued.
While Microsoft may like to position Nuance's product as obscure, Chatow pointed out that Nuance supports VoiceXML 2.0, a recognised standard, and not Salt, which is still making its way through the standards process. Furthermore, the Nuance product is not tied to Microsoft technologies and also works with Java application servers.
Microsoft's Speech Server products will be priced "an order of magnitude lower" than competing products, with details to be announced next week. Yankee's Schoeller predicted Microsoft will undercut the competition by about 30%.
Microsoft will offer free 180-day trial versions of its Speech Server software, which will, initially, only be available in US English. General availability of the software is expected to be a few weeks after launch.
Joris Evers writes for IDG News Service