Categories
Augmented Reality Research & Development

Project Glass: The Tortoise and The Hare

Remember Aesop’s fable about the Tortoise and the Hare? 11,002,798 viewers as of 9 AM Central European Time on April 10, 2012. In the five and a half days since noon Pacific Time on April 4, 2012, over the Easter holiday weekend, the YouTube “vision video” of Google’s Project Glass has probably set a benchmark for how quickly a short, exciting video depicting a cool idea can spread through modern, Internet-connected society. [Update, April 12, 2012: here’s an analysis of what the New Media Index found in the social media “storm” around Project Glass.]

The popularity of the video (and of the Project Glass Google+ page, with 187,000 followers) certainly demonstrates that, beyond the few hundred thousand digerati who follow technology trends, there is keen interest in alternative ways of displaying digital information. Who are these 11 million viewers? Does YouTube have a way to display where, geographically, the views originate?

Although the concepts shown in the video aren’t entirely new, the digerati are responding and engaging passionately with the idea of hands-free, wearable computing displays. I’ve seen (visited) no fewer than 50 blog posts on the subject of Project Glass. Most simply report on the appearance of the concept video and ask whether it could be possible. A few have invested more thought.

Blair MacIntyre was one of the first to jump in with a critical assessment, less than a day after the announcement. He fears that the race to new computing experiences will be compromised by Google moving too quickly, when slow, methodical work would lead to a more certain outcome. Based on the research in his own lab and in those of colleagues around the world, Blair knows that the state of the art in many of the core technologies necessary for the Project Glass vision is too primitive to deliver, reliably and within the next year, the concepts shown in the video. He fears that by setting the bar as high as the video does, Google will inflate expectations, and that failure to deliver will create a generation of skeptics. The “finish line” for all those who envisage a day when information is contextual and delivered in a more intuitive manner will move further out.

In a similar “not too fast” vein, my favorite post (so far; we are still less than a week into this) is Gene Becker‘s April 6 post, 48 hours after the announcement, on his blog The Connected World. Gene shares my fascination with the possibility that head-mounted sensors like those proposed for Project Glass would lead to continuous life capture. Continuous life capture has been demonstrated for years (Gordon Bell has spent much of his career exploring it and wrote Total Recall about it, and related technologies are under active development in projects such as SenseCam), but we’ve not had all the right components in the right place at the right price. Gene focuses on the potential for participatory media applications. I prefer to focus on the anticipatory services that could be furnished to users of such devices.

It’s not explicitly mentioned in the video, but Gene points out something I’ve also raised, and it is my contribution to the discussion about Project Glass with this post: think about the user inputs that will control the system. More than fingertips, more of the human body (e.g., voice, gesture) will be necessary to control a hands-free information capture, display and control system. Gene writes: “Glass will need an interaction language. What are the hands-free equivalents of select, click, scroll, drag, pinch, swipe, copy/paste, show/hide and quit? How does the system differentiate between an interface command and a nod, a word, a glance meant for a friend?”

Any move away from keyboards and mice as input and user interface devices will require a new interaction language.
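To make “interaction language” concrete, here is a minimal sketch in Python of what a tiny hands-free vocabulary might look like. Everything in it is hypothetical and is not based on anything Google has shown; the two-modality rule at the end is just one possible answer to Gene’s “nod meant for a friend” problem.

```python
# Illustrative sketch only: one way an "interaction language" for a
# hands-free display might be expressed in code. All names are hypothetical.
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Command(Enum):
    """Hands-free equivalents of familiar desktop actions."""
    SELECT = auto()
    DISMISS = auto()
    CAPTURE = auto()


@dataclass
class Binding:
    """Binds one command to several input modalities at once."""
    command: Command
    voice_phrase: str      # e.g. "take a photo"
    head_gesture: str      # e.g. "nod", "shake", "double-blink"
    gaze_dwell_ms: int     # how long the user must look at a target (0 = unused)


# A tiny vocabulary. The hard problem is not the table itself but telling an
# intentional nod from a nod meant for a friend; hence a command fires only
# when at least two modalities agree.
VOCABULARY = [
    Binding(Command.SELECT, "select this", "nod", 600),
    Binding(Command.DISMISS, "never mind", "shake", 0),
    Binding(Command.CAPTURE, "take a photo", "double-blink", 400),
]


def resolve(voice: Optional[str], gesture: Optional[str], dwell_ms: int) -> Optional[Command]:
    """Return a command only when at least two modalities agree."""
    for b in VOCABULARY:
        signals = sum([
            voice == b.voice_phrase,
            gesture == b.head_gesture,
            b.gaze_dwell_ms > 0 and dwell_ms >= b.gaze_dwell_ms,
        ])
        if signals >= 2:
            return b.command
    return None
```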

The success of personal computing leveraged, in some ways, a century of experience with the typewriter keyboard, to which the mouse and the graphical (2D) user interface were late but fundamental additions. The success of using sensors on the body and in the real world, and of using objects and places as interaction (and display) surfaces for data, will rely on intelligent use of more of our own senses, many more metaphors bridging the physical and digital worlds, and highly flexible, multi-modal, open platforms.

Is it appropriate for Google to define its own hands-free information interaction language? I understand that the Kinect camera’s point of view is 180 degrees from that of a head-mounted device, and that it is a depth camera rather than the small, simple camera on the Project Glass device, but what can we reuse and learn from Kinect? Who else should participate? How many failures will it take before we get this right? How can a community of experts and users be involved in innovating around, and contributing to, this important element of our future information and communication platforms?

I’m not suggesting that 2012 is the right time to codify standards around voice and/or gesture interfaces. Rather, I’m recommending that when Project Glass ships a first product, it include an open interface permitting developers to explore different strategies for controlling information. Google should offer open APIs for interactions, at least to research labs and qualified developers, in the same manner that Microsoft has with Kinect, as soon as possible.
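To show what such an open interface could mean in practice, here is a hypothetical sketch: the device exposes raw multimodal frames, and third parties register their own recognizers, roughly the way the Kinect SDK exposes skeleton and depth data for developers to build gesture recognition on top of. None of the classes or field names below come from any real SDK.

```python
# Hypothetical open interaction API: the platform owns the sensors, and
# developers plug in their own control strategies.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SensorFrame:
    """One tick of data the headset could expose to developers (all fields assumed)."""
    timestamp_ms: int
    audio_transcript: Optional[str]   # from on-device speech recognition
    head_motion: Optional[str]        # e.g. "nod", "shake", "still"
    gaze_target_id: Optional[str]     # id of the UI element being looked at


class InteractionRecognizer(ABC):
    """Anyone, from a research lab to a third-party developer, implements this."""

    @abstractmethod
    def process(self, frame: SensorFrame) -> Optional[str]:
        """Return a command name, or None if this frame means nothing."""


class NodToSelect(InteractionRecognizer):
    """A trivial example strategy: nod while looking at something to select it."""

    def process(self, frame: SensorFrame) -> Optional[str]:
        if frame.head_motion == "nod" and frame.gaze_target_id:
            return f"select:{frame.gaze_target_id}"
        return None


class InteractionRuntime:
    """The platform side: dispatches frames to whichever recognizers are registered."""

    def __init__(self) -> None:
        self._recognizers: List[InteractionRecognizer] = []

    def register(self, recognizer: InteractionRecognizer) -> None:
        self._recognizers.append(recognizer)

    def on_frame(self, frame: SensorFrame) -> List[str]:
        commands = [r.process(frame) for r in self._recognizers]
        return [c for c in commands if c is not None]
```

The details matter far less than the division of labor: the platform owns the sensors, and the community experiments with the language built on top of them.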

If Google is the hasty hare, as Blair suggests, is Microsoft the “tortoise” in the journey to provide hands-free interaction? What is Apple working on, and will it behave like the tortoise?

Regardless of the order in which the big technology players enter, there will be many others who notice the attention Project Glass has received. The dialog on the myriad open issues surrounding this new information delivery paradigm is very valuable. I hope Project Glass doesn’t release too soon, but with virtually all the posts I’ve read closing by asking when the blogger can get their hands on (and nose under) a pair, the pressure to reach the first metaphorical finish line must be enormous.

Categories
Augmented Reality

Augmented Vision 2

It’s time, following my post on Rob Spence’s Augmented Vision and the recent buzz in the blogosphere about eyewear for hands-free AR (TechCrunch on Feb 6, Wired on Feb 13, Augmented Planet on Feb 15), to return to this subject.

I could examine the current state of the art of the technology for hands-free AR (the hardware, the software and the content). But there’s too much information I could not reveal, and much more I have yet to discover.

I could speculate about whether, what and when Google will introduce its Goggles, as has been rumored for nearly three months. By the way, I didn’t need a report to shed light on this. In April 2011, when I visited the Google campus, one of the people with whom I met (complete with his personal display) was wearable computing guru and director of the Georgia Institute of Technology Contextual Computing Group, Thad Starner. A matter of months later, he was followed to Google by Rich deVaul, whose 2003 dissertation on The Memory Glasses project certainly qualifies him on the subject of eyewear. There could, in the near future, be some cool new products rolling out for us “ordinary humans” to take photos with our sunglasses and transfer them to our smartphones. There might be tools for creating a log of our lives with these, which would be very helpful. But these are not, strictly speaking, AR applications.

Instead, let me focus on who, in my opinion, is most likely to adopt the next generation of non-military see-through eyewear with AR capabilities. It will not be you, me, or the early technology adopter next door.

It will be those for whom having certain, very specific pieces of additional information available in real time (with the ability to convey them to others), while also having the use of both hands, is life-saving or performance-enhancing. In other words, professional applications are going to come first. In the life-saving category, those who work in the most dangerous field in the world (i.e., military action) probably already have something close to AR.

Beyond defense, let’s assume that those who respond to an unfamiliar location to rescue people endangered by fire, flooding, earthquakes and other disasters need both of their hands as well as real-time information about their surroundings. This blog post on the Tanagram web site (the source of the image above) makes a very strong case for the use of AR vision.

People who explore dark places, such as underwater crevices near a shipwreck or a mine shaft, already wear cameras on their heads and suits that monitor heart rate, temperature, pressure and other ambient conditions. The next logical step is to have helpful information superimposed on the immediate surroundings. Using cameras to recognize natural features in buildings (with or without the aid of markers), and altimeters to determine how far underground or above ground the user has gone, floor plans and readings from local activity sensors could be overlaid where they are most valuable for saving lives.
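To illustrate the kind of arithmetic involved, here is a rough sketch, not taken from any real rescue system, that converts barometric pressure readings into a floor estimate and picks the corresponding floor plan to overlay. The barometric formula is standard; the 3.5 m storey height and the floor-plan lookup are assumptions made only for the example.

```python
# Rough sketch: estimate how many floors a responder has climbed or descended
# since entering a building, using a barometric altimeter, then pick the plan.

def altitude_m(pressure_hpa: float, reference_hpa: float = 1013.25) -> float:
    """International barometric formula: altitude in metres from pressure in hPa."""
    return 44330.0 * (1.0 - (pressure_hpa / reference_hpa) ** (1.0 / 5.255))


def floor_offset(entry_pressure_hpa: float, current_pressure_hpa: float,
                 storey_height_m: float = 3.5) -> int:
    """Floors above (positive) or below (negative) the entry point; 3.5 m per storey assumed."""
    delta = altitude_m(current_pressure_hpa) - altitude_m(entry_pressure_hpa)
    return round(delta / storey_height_m)


def plan_for(floor: int, floor_plans: dict) -> str:
    """Pick the stored floor plan to overlay; the data structure is hypothetical."""
    return floor_plans.get(floor, "no plan on file for this level")


# Example: pressure fell ~1.2 hPa since the responder entered at street level,
# roughly +10 m of altitude, i.e. about three storeys up.
plans = {0: "lobby.svg", 3: "floor3.svg"}
print(plan_for(floor_offset(1013.2, 1012.0), plans))  # -> "floor3.svg"
```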

I hope never to have to rely on these myself, but I won’t be surprised if one day I find myself rescued from a dangerous place by a professional wearing head-mounted gear with Augmented Reality features.

Categories
Augmented Reality Social and Societal Standards

Virtual Public Art Project

Some believe that experiencing art in digital forms, interacting with or set in real-world settings, will be a widely adopted use case for Augmented Reality. People will be able to experience more examples of artistic expression in more places, and to contribute by expressing themselves through their software and mobile devices. Projects exploring the interaction of digital and physical objects are quite popular at the annual SIGGRAPH event.

One of the earliest projects using the Layar AR browser for artistic expression in public (and private) spaces is the Virtual Public Art Project (VPAP), begun in March 2010 by Christopher Manzione, a New York City artist and sculptor. Manzione created the VPAP by publishing his creations in a dedicated layer. The site says:

VPAP is the first mobile AR outdoor art experience ever, and maximizes public reception of AR art through compatibility with both iPhone 3GS and Android phones using the free Layar application.

Artists around the world have invested in putting their digital work into the VPAP layer. Projects like this one certainly have the potential to dramatically change how people interact with one another and with art, especially if they are also able to leave their comments or opinions about the artist's work.

One of the limitations of the current VPAP, and perhaps a reason it has not received attention since the fall/winter of 2010-2011, is that it is viewable in only one browser. If there were standards for AR formatting, as there are today for formatting content viewed in a Web browser, then any viewer application capable of detecting the user's context, such as the AR browsers from Wikitude and metaio (junaio), would also provide access to the artists' work. In an ideal world, one source of content could offer all users the same or similar experiences, using their software of choice.

In the face of multiple proprietary technology silos (and of client projects requiring wide, browser-neutral audiences), some AR experience developers offer services based on a single back end with interfaces to each of the publishing platforms. Examples include Hoppala Augmentation by Hoppala Agency, BuildAR by MOB Labs and MARways by mCRUMBS. In each case, these platforms streamline the publishing process so the content creator gets the widest possible reach.
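As an illustration of the "publish once, reach every browser" pattern these services embody, here is a hypothetical sketch of a back end that fans a single piece of AR art out through per-platform adapters. The platform names are real, but the adapter classes and payload formats are invented for the example and do not reproduce any platform's actual API.

```python
# Hypothetical "single back end, many AR browsers" publishing sketch.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class ArtPOI:
    """One piece of geo-located AR art, stored once in the back end."""
    title: str
    artist: str
    lat: float
    lon: float
    model_url: str   # link to the 3D asset


class BrowserAdapter(ABC):
    """One adapter per proprietary AR browser or platform."""

    @abstractmethod
    def publish(self, poi: ArtPOI) -> dict:
        """Translate the POI into whatever the target platform expects."""


class LayarAdapter(BrowserAdapter):
    def publish(self, poi: ArtPOI) -> dict:
        # Illustrative payload only, not Layar's actual POI schema.
        return {"hotspot": {"title": poi.title, "lat": poi.lat,
                            "lon": poi.lon, "object": poi.model_url}}


class JunaioAdapter(BrowserAdapter):
    def publish(self, poi: ArtPOI) -> dict:
        # Again, a stand-in format, not metaio's real channel schema.
        return {"poi": {"name": poi.title, "location": [poi.lat, poi.lon],
                        "asset": poi.model_url}}


def publish_everywhere(poi: ArtPOI, adapters: List[BrowserAdapter]) -> List[dict]:
    """The content creator publishes once; the back end fans out to every platform."""
    return [adapter.publish(poi) for adapter in adapters]
```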

Will they also need to write interfaces to the next AR browsers? What will these platforms be capable of when Web browsers also support AR?

I guess these are not questions on which artists should be spending their time.