In the early days of virtual personal assistants, the goal was to create a versatile digital companion that was always there, ready to take on any task. Today, tech companies are realizing that doing it all is too much, and are instead doubling down on what they know best.
For Google, that means allowing Google Assistant to take over the tasks you could ask a real personal assistant to do if you were too busy with work. At its I/O developer conference this week, the company outlined plans for Google Assistant to handle most of the work of renting a car, and last year it demonstrated the Assistant placing automated phone calls on users’ behalf. Meanwhile, at its Build conference in Seattle this week, Microsoft made it clear that it’s approaching the assistant role from a different angle. Because the company has a deep understanding of how organizations work, Microsoft is focusing on managing your workday by voice: rescheduling meetings and turning the dials on the juggernaut of bureaucracy in concert with your phone.
“What excites me is stepping back and thinking about the promise of natural language systems,” says Dan Klein, a technical fellow at Microsoft who co-founded Semantic Machines, a natural language processing company acquired by Microsoft last year. “It’s not being able to press a button with your voice. That’s great, but the real promise of a natural language system is being able to do a wide range of things with a uniform interface that is natural to you, it’s faster than the alternative.”
If Microsoft or Google can deliver on that promise, their virtual assistants won’t just be trendy add-ons for users who want to set alarms or move calendar invites by speaking out loud. Voice is the next major platform, and being first to it is an opportunity to make the category as popular as Apple made touchscreens. To dominate even one aspect of voice technology is to tap into the next iteration of how humans use computers.
Cortana’s work prowess
Just as the smartphone made touch a popular, if not the most popular, way to interact with software, big tech companies see voice as a similar revolution. It has the potential to be faster and more intuitive, and also a convenient alternative to spending our lives staring at screens. With minimal setup, you can talk to your phone or laptop like you would a person, and be completely unaware that you’re replacing one computer with another.
But a true all-purpose virtual assistant is difficult to build because AI today only works in narrow domains. You might be able to teach it to answer questions about coffee by collecting data about coffee and training an algorithm to extract answers from that data. To do this for everything, though, you would have to compile data about every known subject, verify that all of it is true, and update it with each new piece of knowledge. And that’s just to retrieve information, not to mention the computational effort it takes to understand the context or parse the meaning of a human conversation.
Because of these challenges, virtual assistants today focus on smaller tasks that tend to be personal (ordering an Uber or making a restaurant reservation) or professional (“tell me what’s on my calendar”).
With Cortana, Microsoft is betting heavily on the latter, a mission made possible by its 2018 acquisition of Semantic Machines. During a demonstration of Cortana for Quartz, Semantic Machines co-founder Klein described the experience of using a virtual personal assistant today as a series of isolated sessions. You start a session by asking a question or giving a command, then that session ends. There are a few situations where you might be able to ask another question, but those interactions are “fragile,” he says, meaning follow-up questions are usually limited. For example, if a virtual assistant asks you “Did this answer your question?” and you say “No,” it just restarts the session.
The next Cortana tries to break the pattern of short, isolated sessions. In the demo, Klein asks what his day will look like tomorrow, to which Cortana responds by bringing up his calendar. He then asks where a lunch event is, and Cortana extracts the information from an event invite and displays it. He asks what the weather is like “there,” and Cortana pulls the weather forecast for the event location at the exact time of the event. He asks if there is any outdoor seating, and Cortana looks online and determines that there is none. In the middle of his round of questions, Klein asks Cortana to give him time to run an errand after his last appointment. Then he asks Cortana to organize an event after lunch and invite “Andy” and Andy’s manager. Cortana figures out which Andy he’s talking about, finds Andy’s manager, and invites them both to the meeting.
Sure, it was a planned demo using a fake calendar, but it was real code. A Microsoft representative told Quartz the questions were extemporaneous, based on what Klein knew the system could do.
“I think we can fundamentally help people save time to do what they want to do,” says Andrew Shuman, vice president of Cortana Engineering. “They spend so much time in front of Microsoft services and products that we owe it to our customers to give back their time.”
Google’s “personal” assistant
Google is also working from its own treasure trove of data, emphasizing in its case the “personal” aspect of the virtual personal assistant.
The company has made particular breakthroughs in its voice technology, under the Duplex brand. Last year, it demonstrated the ability to call local businesses on a user’s behalf to get information such as store hours, and it can also make appointments and reservations. Earlier this week, the company announced new features for Google Assistant that make even more use of Google’s huge database of user information. Starting this year, for example, Assistant will be able to reference data it has from Gmail to automatically fill in the information required to reserve a car on a rental website.
It’s not hard to imagine the vast universe of other personal data that Google Assistant could tap into, as many people plan leisure activities and manage their entire lives on Google services.
It’s not so much an AI breakthrough as a super-powerful autofill, made possible by Google’s ability to understand the personal lives of its users in increasingly intimate ways. Google may have ambitions to be the all-around assistant, but those ambitions are constrained by both AI limitations and market realities. Google has a huge trove of personal data, but its business and commerce operations are overshadowed by Microsoft’s.
Every voice contender has struggled to gain traction by building a single assistant. Amazon, which created the smart speaker category with its Alexa line of devices, has increased the number of devices Alexa inhabits, bringing the virtual personal assistant to wall clocks and microwaves. But that hasn’t significantly changed the kinds of interactions users have with these devices, at least not beyond the natural differences between wall clocks and microwaves. Apple’s Siri, the original mainstream virtual assistant, can call an Uber or order food on Caviar, but only because the company gave developers the ability to connect their software to Siri. The company hasn’t done much else to develop Siri’s proprietary technology in the past five years.
For now, these companies seem resigned to their inability to create a dominant assistant that people will actually use for both work and play. Even Microsoft has partnered with Amazon so that Cortana users can call on Alexa for their e-commerce needs. But a piece of the voice pie is better than no pie at all, and the tech giants hope the blurring of work and life will make any virtual assistant valuable in both areas. “It’s important to recognize that these kinds of workplace issues are universal issues,” Shuman says. “It’s not like I’m going home and not having to collaborate or plan things or manage tasks and to-do lists.”