When Alvaro Cassinelli, the winner of the 2011 grand prize at Laval Virtual, the largest annual Virtual Reality conference, was asked by the Guardian what motivated him to develop a platform using Augmented Reality and everyday objects to represent a user's request, his reply revealed something to which we should all pay attention.
Cassinelli said, "non-verbal communication was (and still is) the most reliable device I have when I want to avoid ambiguity in everyday situations." He was referring to the fact that, as a South American living in Japan, he often finds himself in situations where spoken communication is unclear.
One doesn't have to be living with cultural and linguistic barriers to need gestures. I learned the value of technology-assisted non-verbal communication 20 years ago. During one of my first sessions using a personal videoconferencing system in my home office with a client who was then working at Apple, his words and his gestures did not align! He said "maybe" in response to a recommendation I made, but the posture of his head and the position of his hands said "no way." This was an "aha" moment that convinced me of how valuable technology could be for sharing non-verbal communication when a meeting involves a remote participant.
In 2004, when I started working with the partners of the EU-funded (FP6) Augmented Multiparty Interaction project, one of the objectives of using computer vision was to analyze the non-verbal communication expressed in gestures and to compare it with the spoken words during a business meeting. One of the conclusions of the project's research was that computers can detect when there is a discrepancy between verbal and non-verbal signals, but they cannot determine which of the two messages is the one the user intended to communicate.
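To make that conclusion concrete, here is a minimal sketch of the idea, not the project's actual pipeline. The two toy classifiers and their labels are hypothetical stand-ins for real speech and vision analysers; the point is that the system can flag a mismatch while leaving the intended message undecided.

```python
# Hypothetical sketch: flag a verbal/non-verbal discrepancy without deciding
# which of the two signals reflects the speaker's intent.

from dataclasses import dataclass

@dataclass
class MeetingMoment:
    transcript: str      # what the participant said
    gesture: str         # e.g. "nod", "head_shake", "arms_crossed"

def verbal_label(transcript: str) -> str:
    """Toy stand-in for a speech/sentiment analyser."""
    text = transcript.lower()
    if any(w in text for w in ("yes", "sure", "maybe", "sounds good")):
        return "agree"
    if any(w in text for w in ("no", "won't", "can't")):
        return "disagree"
    return "neutral"

def gesture_label(gesture: str) -> str:
    """Toy stand-in for a vision-based gesture/posture analyser."""
    mapping = {"nod": "agree", "head_shake": "disagree", "arms_crossed": "disagree"}
    return mapping.get(gesture, "neutral")

def check_discrepancy(moment: MeetingMoment) -> dict:
    v, g = verbal_label(moment.transcript), gesture_label(moment.gesture)
    return {
        "verbal": v,
        "non_verbal": g,
        "discrepancy": v != g,   # the mismatch is detectable ...
        "intended": None,        # ... but the intended message stays unknown
    }

if __name__ == "__main__":
    moment = MeetingMoment(transcript="Maybe, we could try that.", gesture="head_shake")
    print(check_discrepancy(moment))
    # {'verbal': 'agree', 'non_verbal': 'disagree', 'discrepancy': True, 'intended': None}
```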
If and when gestures become a way of communicating our instructions to a digital assistant, will we all need to learn to use the same gestures? Or will we individually train the systems watching us to recognize the gestures we designate? These are a few of the questions I raised in the position paper AR Human Interfaces: The Case of Gestures. I don't have the answers to these questions, but I'm certain that it will take multiple disciplines working together over many iterations to get us to an optimal balance of standard and personal gestures, just as we have in other forms of communication.
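As a thought experiment, the second option, a personal gesture vocabulary, could look something like the following sketch. Everything here is an assumption for illustration: a user records a few examples of a gesture they designate for a command, and later observations are matched against those examples by nearest-centroid distance. The feature vectors are placeholders for whatever a real hand or pose tracker would produce.

```python
# Hypothetical sketch of a per-user gesture vocabulary: enroll a few examples
# of each user-designated gesture, then match new observations to the closest
# enrolled command.

from math import dist
from statistics import mean
from typing import Optional

class PersonalGestureRecognizer:
    def __init__(self):
        self.templates = {}   # command name -> list of example feature vectors

    def enroll(self, command: str, example: list) -> None:
        """Store one user-recorded example of the gesture for `command`."""
        self.templates.setdefault(command, []).append(example)

    def recognize(self, observation: list, max_distance: float = 1.0) -> Optional[str]:
        """Return the enrolled command whose examples lie closest, or None."""
        best_command, best_distance = None, max_distance
        for command, examples in self.templates.items():
            centroid = [mean(values) for values in zip(*examples)]
            d = dist(observation, centroid)
            if d < best_distance:
                best_command, best_distance = command, d
        return best_command

recognizer = PersonalGestureRecognizer()
recognizer.enroll("accept", [0.9, 0.1, 0.0])   # e.g. a thumbs-up, as this user performs it
recognizer.enroll("accept", [0.8, 0.2, 0.1])
recognizer.enroll("dismiss", [0.1, 0.9, 0.7])  # e.g. a wave-away
print(recognizer.recognize([0.85, 0.15, 0.05]))   # -> "accept"
```

Even a toy like this makes the open questions visible: someone has to decide which gestures are standardized across users and devices, and which are left to personal enrollment.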
Cassinelli won a prize for innovation and I'm confident that he's on to something, but it will be several more years before gestures are reliable for man-machine interfaces.