Not all of us want to talk to our technology. Do we have a choice?

The future has become very talkative, which is bound to make some people uncomfortable.

At the recent Google I/O event and Apple’s Worldwide Developers Conference (WWDC), many of the new features involved interacting with AI by speaking to it through a phone (or, in Google’s case, devices such as smart glasses). And with the new Siri AI, we also saw Apple presenters talking to their iPhones during the WWDC keynote, explaining all the new ways people can interact with the virtual assistant.

This push toward a more voice-focused future sounds like progress, but it assumes that everyone is comfortable thinking out loud, which could further alienate people who may already be wary of AI.

One of the most notable developments in artificial intelligence in recent years has been the ability to interact with large language models in a conversational manner. We’ve gone from issuing direct commands to fielding chatty responses from an AI that seems to be trying hard to be your best friend.

In fact, one of the breakthroughs announced at Google I/O was Gemini’s ability to analyze our fragmented human speech patterns, including all the ums, ahs, and broken sentences, to figure out what we’re really saying. I can almost imagine a patient but frustrated AI waiting with the words “just get to it already” written on its virtual face.

But that’s the point, isn’t it? It’s already easy to think of Gemini or Siri (or, primarily through text, Claude or ChatGPT) as single entities and to interact with them the way I talk to a friend while strolling down the sidewalk, bouncing ideas back and forth.

The difference is that when I talk to the AI, I’m standing in a public place talking to myself.

You could argue that this isn’t a big problem now. It’s common to see people making calls in public places wearing Apple AirPods or other wireless earbuds. We’ve normalized the body language and the specific pause-and-answer rhythm of someone speaking on a call without actually holding a phone to their ear. Even if we don’t see the earbuds, we assume that’s what they’re doing. It wasn’t that long ago that making a phone call in public was considered rude.

But not everyone is so verbal. As a writer, I’ve tried using dictation (including a spell when a broken collarbone didn’t leave me much choice), but it has always felt more natural to write words through my fingers. Speaking and writing are two separate disciplines, even if they share a language.

Using a person’s voice as an interface is great for on-stage demos, and in many contexts it’s the best (or only) option: hands need to stay on the wheel while driving, and smart glasses don’t have keyboards. And for people who can’t see screens easily, I imagine LLMs that recognize voice and conversation are really useful.

This is also a social problem. It’s bad enough when people use speakerphone in public places to make calls (often on topics that should be private) without any regard for the people around them. Now will everyone nearby need to sit through their party planning or attempts to secure a restaurant reservation? It is a further erosion of respect for the people around us.

It also forms another barrier to actual communication. If you see someone dressed nicely, you can politely ask them where they got that outfit. Now, you can take a photo and ask AI to identify it, losing a moment of human connection while simultaneously looking like a creep sneaking a photo.

Norms change with technology, so I’m sure there will be a level of (grudging) acceptance of people talking to seemingly no one while they’re interacting with their devices.

But are we heading toward a world where we’re surrounded by overlapping conversations and no one talks to each other? Talking to our phones, watches, glasses, and AI pins sounds like a lot of hype at a time when people are already tired of AI.
