Microsoft has a central artificial intelligence (AI) organisation – AI Research – where 5,000 people support AI for a range of products including Office 365 and Azure.
The group has projects covering Bing, Cortana, Cognitive Services, Microsoft Bot Framework, as well as supporting incubator projects.
Lili Cheng heads Microsoft’s Future Social Experiences (Fuse) Labs. Her team’s work feeds into the Bot Framework, Cognitive Services and Cortana. Fuse Labs is a multidisciplinary team, which collaborates with academia, startups, the art and design community, and other Microsoft teams.
Computer Weekly met up with Cheng at Microsoft Build 2017 in Seattle, where the ability to build skills in Cortana was one of the highlights on the first day of the developer conference.
Extending the application of artificial intelligence is core to Microsoft’s strategy to commercialise and productise AI. Cheng says Microsoft has been developing custom AI models to support speech, language and vision applications. One example where customisation is necessary is in speech recognition.
“We have speech technology, which we make available to third parties. Often these speech models are not customised to your voice, or for the types of scenarios an individual developer may want to use for speech,” says Cheng.
“Custom speech enables you to customise speech to your accent or to understand certain terms. Often, when we build a business application, a company will use certain terms that don’t appear in a common dictionary.”
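The idea of a custom vocabulary can be illustrated with a simple post-recognition correction pass. The sketch below is a hypothetical stand-in, not Microsoft’s Custom Speech service: it biases a generic transcript toward business terms that, as Cheng notes, don’t appear in a common dictionary. The glossary entries are invented for illustration.

```python
import re

# Hypothetical domain glossary: phrases a generic speech model is likely
# to mis-transcribe, mapped to the terms the business actually uses.
DOMAIN_TERMS = {
    "sequel server": "SQL Server",
    "azure dev ops": "Azure DevOps",
    "q and a maker": "QnA Maker",
}

def apply_custom_vocabulary(transcript: str, glossary: dict) -> str:
    """Replace likely mis-transcriptions with the customer's own terms."""
    for heard, term in glossary.items():
        # Whole-phrase, case-insensitive match so "Sequel Server" is caught too.
        pattern = re.compile(r"\b" + re.escape(heard) + r"\b", re.IGNORECASE)
        transcript = pattern.sub(term, transcript)
    return transcript

print(apply_custom_vocabulary(
    "deploy the app to sequel server via azure dev ops",
    DOMAIN_TERMS,
))
# -> deploy the app to SQL Server via Azure DevOps
```

In a production system this biasing would happen inside the recognition model itself rather than as a text rewrite, but the effect on the user is the same: domain terms come out correctly.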
Cheng cites a recent example where Microsoft worked with the Singapore government to customise AI for digital services. The government wanted a way for citizens to ask natural language questions to find digital services in a more intuitive way.
“So many businesses are really interested in how they can engage with their customers. Microsoft is looking at making conversational AI more predictable for users and easier for developers to use,” she says.
For Cheng, there are two parts to using AI for chat. First, she says, it can be used to improve the understanding of what the user has typed in.
“In an email, you may want to schedule a meeting, and in most cases you would need to navigate to your calendar, then copy and paste the message, and find the addressee in the address book. It is actually really cumbersome. It takes more time than it should,” she says.
But with AI understanding what the user wants to do, “it could read the email you are writing, open up your calendar automatically, and it would also know the recipient, so the task would be a lot faster”.
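The kind of intent understanding Cheng describes can be sketched as a simple rule-based extractor. This is an illustrative toy, not Microsoft’s actual pipeline (which would use a trained language-understanding service): it spots a scheduling request in an email and pulls out the details needed to open the calendar with fields pre-filled. All patterns and names are assumptions.

```python
import re
from dataclasses import dataclass

@dataclass
class MeetingIntent:
    recipient: str  # who the meeting is with
    day: str        # rough time expression found in the text, or ""

def extract_meeting_intent(email_from: str, body: str):
    """Detect a scheduling request in an email body and extract details,
    so a calendar entry can be pre-filled instead of copy-pasted by hand."""
    if not re.search(r"\b(meet|meeting|schedule|catch up)\b", body, re.IGNORECASE):
        return None  # no scheduling intent detected
    # The sender is the obvious attendee -- no address-book lookup needed.
    day_match = re.search(
        r"\b(monday|tuesday|wednesday|thursday|friday|tomorrow|next week)\b",
        body, re.IGNORECASE)
    return MeetingIntent(
        recipient=email_from,
        day=day_match.group(1).lower() if day_match else "",
    )

intent = extract_meeting_intent(
    "alice@example.com",
    "Can we schedule a quick sync on Thursday about the launch?",
)
print(intent)
# -> MeetingIntent(recipient='alice@example.com', day='thursday')
```

A real implementation would replace the regular expressions with a statistical intent classifier and entity recogniser, but the shape of the task — classify the intent, extract the entities, act on them — is the same.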
A better user experience
Much of Microsoft’s AI work is about making computer use more productive. With artificial intelligence, Cheng believes the industry gets a chance to rebuild the premise behind the very first human-computer interface – the command line.
“The command line had this promise of dialogue, but it was the most brutal dialogue ever because you had to know which eight letters to type in. But it answered back too, which I loved,” she says. “In a way, AI reduces you all the way back to the beginning: can we do better at understanding what people input?”
Cheng believes one of the biggest impacts of AI is how it shapes the way people interact with computers: “We live with our computers and phones for more time than we are willing to admit. I feel that the systems are inherently very binary in the way they are designed. They are very rigid. You need to learn so much just to interact with the machine. You have to learn a set of rules, and we have probably mapped the way we live for 10 hours a day to this list of rules. People want a better baseline user experience, where the computer understands language and maybe understands speech.
“One of those magical moments with AI was watching my parents listen to Spotify through a speaker. They could have used a computer, but this would have been too hard for them. A lot of AI is about creating a great user experience.”
But she says it is very hard to create a great user experience. Making an internet search respond in natural language is incredibly difficult. Consider a query that brings up a list of results, most of which are wrong for the user. “In conversation, if seven answers were wrong, that would be a really bad experience.”
The command line interface was the most obvious first step developers took when they designed a computer that people could interact with, although it was very restrictive. While the graphical user interface (GUI) made strides to make computers more usable, Cheng points out that language is at the core of everything we do.
“We need to put language at the core of our computing experience, otherwise we are constantly mapping our brain onto the structure of the computer,” she says.
Among the challenges for AI engineers is how to map certain techniques common in computer use, such as the browser “back” button, in a way that fits in with a natural language or speech user interface. “Human conversations flow forward in time; people never go back,” says Cheng.
Her team has been investigating the user experience of human conversation, looking at how engineers can tackle the concept of going back through a chat. Almost every chat app has a time-based user interface that lets users scroll back to earlier parts of a conversation, and the app is aware of this history, she says.
Evolution of AI
Cheng joined Microsoft 20 years ago and worked on applications including chat. The researchers expected chat to evolve into some form of machine intelligence and virtual reality, but then the internet took off, and search became the killer application. The usefulness of search is perhaps reaching its limits, and AI is increasingly being used to deliver intelligent search.
According to Cheng, a year ago people did not talk of AI. But this has changed.
“They were afraid to use the term AI because the phrase holds so much promise,” she says. “We are a long way off from understanding the human mind. Maybe that’s a good thing because people are kind of special. But the experiences we are having with the way people converse and the way we handle ambiguity makes it a fun time to be in AI. There are so many advances in speech, language, hardware and software. All companies are investigating this space together.”