photo by Aaron Yoo
Most mornings when I am home in Florida, I’ll say to an empty room: “Alexa, play news from BBC.” My nearest Amazon Echo speaker then plays a five-minute newscast from the British Broadcasting Corporation.
Next, I’ll say, “Alexa, play news from CBC.” And on comes the latest update from Canada.
This routine works flawlessly on my Echo speaker in the family room, on the Echo Dot in my bedroom, or even on the Alexa app on my iPhone. It works when I’m doing the dishes or making coffee. It works even though it wouldn’t be that hard to confuse BBC and CBC. Alexa has been trained – by humans – to understand the difference in my speech.
And I’m OK with that.
The same thing is true in the springtime when I ask for the score of the previous night’s Mets game. Alexa knows the difference between the New York Mets and the Brooklyn Nets, who play on the same side of the East River.
Want to know Hugh Grant’s birthday? Your smart speaker can tell you, even if you come from New York City and pronounce his name “You Grant.” Smart speakers manage to work with people from Boston, western Scotland and other places where people speak my native tongue in ways I myself can’t always decipher.
We live in a world where voice recognition is woven into the fabric of our lives. Even if you don’t have a smart speaker like an Echo, Google Home or HomePod, your phone likely offers a virtual assistant: Siri for Apple users and Google Assistant for Android. This is not to mention a variety of other smart devices with voice commands including televisions, thermostats, cars and more.
But a contingent of users has been shocked to learn that teaching virtual assistants to understand human speech requires letting humans listen in from time to time.
Google recently agreed to suspend language reviews in Europe for three months, following regulatory concern and public outrage over revelations that contractors listen to audio captured by Google Assistant. The company says that such reviews are an important component of teaching its technology to understand different accents and dialects. Regulators in Germany expressed concern that the reviews included not only queries directed to Google Assistant, but also snippets of conversation the devices accidentally recorded.
A Google spokeswoman said that Google was cooperating with the Hamburg data protection authority. She also noted, “We don’t associate audio clips with user accounts during the review process, and only perform reviews for around 0.2% of all clips.” This figure is in line with a statement from David Monsees, the product manager for search at Google, following a contractor’s violation of the company’s data security policies in early July. That contractor leaked more than 1,000 recordings to Belgian news site VRT. The main point on which Google and German regulators seem to disagree is the prevalence of false positives – that is, Google Assistant recording when a user hasn’t activated it.
Google is not the only tech company to run into regulatory pearl-clutching. Apple, which has recently positioned itself as a company especially concerned with customers’ privacy, also recently acknowledged that humans analyze a small percentage of Siri requests to improve the service’s functionality. Like Google, Apple is suspending this review program for a time. (Unlike Google, however, Apple does not appear to be facing any regulatory orders connected to the practice.) Amazon, too, employs human reviewers to improve Alexa’s performance. And last year, Amazon acknowledged that one of its home speakers recorded a private conversation in error and sent it to a person on the device owner’s contact list.
Nor is it only European regulators who are concerned. In May, a coalition of privacy groups filed a complaint with U.S. regulators focused on the potential for Amazon to violate children’s privacy by recording young users’ conversations on its Echo Dot Kids devices. Lawsuits to similar effect are underway in federal court in Seattle and state court in California.
The concerns about smart speakers listening to us strike me as much ado about less than nothing. Maybe if I regularly organized drug deals in my home, I would want to unplug my Amazon Echo before the transactions took place. But for me, and for most users, this simply is not a problem. It is also worth noting that one of the stated aims of human review is to identify and cut down on accidental activations.
The benefits of speech recognition so vastly outweigh the overcaffeinated concerns that the comparison is ludicrous. And that’s for people who, like me, have no physical impairments to reading a screen or using a keyboard. For people with such limitations, smart speakers and other applications of voice recognition are life-changing. It is obvious why voice recognition is valuable to people who are blind or visually impaired, but they are not the only beneficiaries. These devices are also improving the lives of dementia patients, autistic people (especially children and teens), people with limited mobility or paralysis, and – perhaps surprisingly – people who are deaf or hard of hearing. The self-appointed privacy police seem to spare scarcely a thought for these users.
Will mistakes happen? Given enough opportunity, absolutely. Do they matter? In the grand scheme of things, of course they don’t. Humanity did not abandon the wheel because people sometimes get struck by wheeled vehicles. We did not abandon knives because people sometimes cut themselves or others. And those are much more serious injuries than someone listening in on a few seconds of whatever is within earshot of our devices.