If you have not yet listened to the telephone calls placed by Google’s latest conversational application, code-named “Duplex”, you should.
Google Duplex can place calls to make restaurant reservations or book appointments at hair salons. Nothing terribly exciting here: speech recognition and spoken dialogue technology have been capable of completing such tasks for over a decade now. What is exciting, and frankly also a bit scary, is that in the few (probably cherry-picked) recordings made available on the Google AI blog, the robotic caller is practically indistinguishable from a human.
Google Duplex is not the first conversational system to pass the Turing test with flying colors for a few minutes: chatbots have been competing in this arena for years now. Mitsuku, the chatbot from PandoraBots, has been acing the Turing test for the past three years, passing as a human for a longer time and producing chats that are more complex, more intelligent and more emotionally aware than those of Google Duplex. For details on the Loebner Prize competition and past winners follow the link.
So where is the excitement in Duplex? Clearly not in the chatbot component of Duplex but rather in the dialogue component: not in what the Google bot says (the generated chat) but in how it is said (the generated speech). The intonation, pauses, hesitations etc. make the robotic caller sound very much human.
Let’s do a deep dive into the technology. Is Google Duplex a quantum leap technologically? Not really. As far as the spoken dialogue technology is concerned, i.e., what we call discourse management and natural language generation, Google Duplex is state-of-the-art but does not represent a revolution. Spoken dialogue systems that let humans make reservations (and perform other transactional tasks) have existed in the research lab since as early as the 1990s and in industry since the turn of the century, and they are becoming (sometimes annoyingly) ubiquitous. As for the speech generation and speech-to-text components, the Google researchers use standard tools tweaked for their purposes. What is really new here is the model that generates the intonational patterns fed into the generation component, i.e., how things are said so that Duplex sounds as human as possible. When everything works (both what is said and how it is said are ‘right’), the results are stunning. Here is a more polemic (but less technical) opinion from a Wired columnist.
So is Google Duplex just hype? Not true either. Google Duplex represents a quantum leap. The researchers at Google had the audacity to believe that the current state-of-the-art in AI can pass the Turing test for spoken dialogue systems. They got the data, trained the models and connected all the dots. Kudos!
So why does Google Duplex scare us? We are already used to talking and chatting with machines to get customer service, and to receiving robotic calls for telemarketing and polling. We have also learned to live with conversational assistants in our homes, and soon in our cars and offices. The answer is simple: receiving a call from Google Duplex makes us feel ‘cheated’ and ‘manipulated’. Duplex has fooled us! It made us think that it is a fellow human, while in reality we are speaking with a machine. According to our commonsense moral code the machine is acting unethically, and since machines don’t have morality, its creators are labeled as unethical. VentureBeat’s editor thinks so, and he is not alone.
Google marketing sensed the danger and, a couple of days after the news splash about Duplex, declared that from now on the robot will reveal its robotic identity. In Europe in particular, with GDPR coming into effect, ‘dishonest’ robotic callers could be especially damaging to Google’s reputation. Thus the immediate clarification from Google.
So now the ‘honest’ robot will no longer pretend to be human. Why should it sound human then? Why not have it sound something like this:
“Hello, my name is Google Duplex, I am an honest robot calling you to book an adult haircut appointment between ten and noon, this Thursday… with a preference towards earlier time… with Kelsey.”
And all the fun and magic is gone!
Or maybe we could put some of the fun back with the ‘honest Trump’ robot, which might sound something like:
“Hello, I am calling on behalf of John but I am really a robot that has been trained to sound like your president, Donald Trump. I want to have a haircut, it is gonna be a beautiful haircut, …the best haircut, … with Stormy … scrap that… with Kelsey, noonish tomorrow … ok?”
All in all, Google Duplex is a quantum leap in engineering, not technology. Google Duplex is mischievous but not deceitful. It might not be politically correct but it is not unethical. Let us not miss the forest for the trees here. To conclude, here are some important questions that we should be asking as technologists and potential users of Duplex:
- Is there any advantage for Google Duplex to sound human? Do you get a lower reservation success rate from the ‘honest’ robot or the ‘mischievous’ robot?
- Could a criminal potentially use this technology to harm people… other than hurt their pride or waste their time?
- Does Google Duplex save time or waste time in the workplace? How much time is wasted in unsuccessful calls? Would Google Duplex get you banned from your favorite hair salon? 🙂
- What is next? Can we create robots that can have human-like conversations for more interesting (non-trivial) tasks?
- When can we expect to have emotional and behavioral intelligence in voice robots? See also our efforts at Behavioral Signals.
This article appeared first on Medium. The editor is Alexandros Potamianos, CEO and co-founder of Behavioral Signals. Alex is a well-regarded innovator in the field of speech and natural language processing, interactive voice response systems and behavioral informatics. He has over 20 years of leadership experience on both the corporate and entrepreneurial sides of business. His background includes working at AT&T Labs-Research, Bell Labs and Lucent Technologies. His academic achievements have gone hand in hand with his extensive research work: he received his M.Sc. and Ph.D. degrees in Engineering Sciences from Harvard University and later his MBA from the Stern School of Business, NYU. He is a Professor at NTUA and currently lives in Los Angeles, where he enjoys kayaking and sailing.