Philosophers, psychologists, and neuroscientists have long studied how and why humans talk to themselves as they navigate tasks, manage decisions, and solve problems. "Inner speech is the silent conversation that most healthy human beings have with themselves," says Alain Morin, a professor in the Department of Psychology at Mount Royal University in Alberta, Canada.
Now, as robotics marches forward and devices increasingly rely on a combination of machine learning and conventional programming to navigate the physical world, researchers are exploring ways to imbue machines with an inner voice. This "self-consciousness" could provide insight and feedback into how and why these systems make decisions, while helping them improve at various tasks.
The impact on service robots, computer speech systems, virtual assistants, and autonomous vehicles could be significant. "A cognitive architecture for inner speech may be the first step toward functional aspects of robot consciousness. It may represent the beginning of a new domain for human-robot interactions," explains Antonio Chella, professor of robotics and director of the RoboticsLab at the University of Palermo in Italy.
Applying the principles of human speech and cognition to machines is a steep challenge. "Consciousness is a very complex and fuzzy term," observes Angelo Cangelosi, professor of machine learning and robotics at the University of Manchester in the U.K. "While machines may not be aware in the way humans are aware, the idea of modeling speech characteristics to enrich interactions could deliver deeper insight into machine behavior."
The idea is taking shape. In April 2021, Chella and University of Palermo research fellow Arianna Pipitone equipped a robot named Pepper from SoftBank Robotics with a cognitive architecture that models inner speech. This allowed the robot to reason and interact at a deeper level, and to generate vocal feedback about how it arrived at answers and actions. This is possible because the parameters and attributes of the inner voice are different from those for outward expression, the researchers note.
For example, the robot might try to explain to itself why it couldn't do something, such as when an item was unreachable or a task conflicted with its programming. In one scenario, the researchers asked Pepper to set a table for lunch. Relying on programmed rules, it began the task. But when the researchers intentionally asked the robot to place a napkin in the wrong spot, a request that ran counter to its programming, Pepper used ordinary language to say that it was confused and upset.
Suddenly, the robot was more than a black box. It said that while it would never intentionally break the rules, it couldn't upset the human so it would go along with the request. "Our goal was to improve the transparency and the quality of the interaction between humans and robots. Through inner speech, the robot becomes more trustworthy. It can explain its behavior and its underlying decision-making processes," Pipitone says.
After conducting a series of experiments that introduced such conflicts, the researchers arrived at a startling conclusion: Through self-dialogue, Pepper learned how to perform tasks better. Its completion rate improved because it engaged in a continual "dress rehearsal" that allowed it to think through scenarios ahead of time.
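The table-setting conflict described above can be illustrated with a minimal Python sketch. It is not the researchers' cognitive architecture: the rule table, the class name InnerSpeechRobot, and the wording of each utterance are hypothetical stand-ins, chosen only to show how a robot might put a rule conflict into words before acting.

```python
from dataclasses import dataclass, field

# Hypothetical etiquette rules mapping each item to its expected position.
# These entries are illustrative assumptions, not the researchers' rule set.
TABLE_RULES = {
    "napkin": "left of plate",
    "fork": "left of plate",
    "knife": "right of plate",
}


@dataclass
class InnerSpeechRobot:
    """Toy agent that records its reasoning as inner-speech utterances."""

    rules: dict
    transcript: list = field(default_factory=list)

    def think(self, thought: str) -> None:
        # Store the utterance so the reasoning can be inspected afterward.
        self.transcript.append(thought)

    def place(self, item: str, position: str) -> str:
        # Compare the request with the rules and verbalize any conflict.
        expected = self.rules.get(item)
        if expected is None:
            self.think(f"I have no rule for the {item}, so I will follow the request.")
        elif expected == position:
            self.think(f"The {item} belongs {expected}; the request matches my rules.")
        else:
            self.think(f"My rules say the {item} belongs {expected}, not {position}.")
            self.think("I would not break the rule on my own, but I do not want to upset the person.")
            self.think("I will go along with the request and explain why.")
        return f"placed {item} {position}"


robot = InnerSpeechRobot(TABLE_RULES)
result = robot.place("napkin", "right of plate")
print(result)
for line in robot.transcript:
    print("inner speech:", line)
```

In this sketch the transcript is what makes the decision inspectable: a person can read why the robot complied with a request that contradicted its rules, which is the kind of transparency Pipitone describes.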
Voice of Reason
Not surprisingly, linguistics serves as the basis for a robot inner voice. Morin, whose human behavioral models served as the foundation for the experiment with Pepper, says that inner self-awareness and speech provide a vehicle for enhanced cognition in humans, and possibly in machines as well. "The inner voice is crucial. It acts like a little internal mirror which allows us to gain a more objective perspective about ourselves."
Cangelosi says the experiments represent a step forward in building more adaptable and dynamic robots and systems, but significant challenges remain. Recognizing objects such as a water glass or a butter knife isn't difficult; today's AI can detect shapes well. Embedding linguistics and even emotional components, however, is incredibly complex. "Once you get to abstract words and thinking, it's a different scenario," he points out.
Nevertheless, the nascent field will likely advance over the next few years. Cangelosi believes it eventually could allow people to ask a machine, such as an autonomous vehicle or a virtual assistant, why it behaved a certain way or how it deals with a situation when it can't perform the requested action. Through machine learning and machine social interactions, an inner voice might also help devices learn from one another.
In addition, Chella points out that computational models of inner speech could be used as tools for neuroscientists to investigate specific normal and pathological conditions in humans. "This might include people who hear voices and people who are deaf from birth and therefore unable to generate normal forms of inner speech."
Concludes Pipitone, "This represents a new methodology for developing trustworthy human-robot interactions. We believe that inner speech may be a key to new and more transparent algorithms of machine learning."
Samuel Greengard is an author and journalist based in West Linn, OR, USA.