Categories
Augmented Reality Business Strategy

Augmented Reality SDK Confusion

I don't feel confused about AR SDKs but I wonder if some of those who are releasing new so-called AR SDKs have neglected to study the AR ecosystem. In my depiction of the Augmented Reality ecosystem, the "Packaging" segment is at the center, between delivery and three other important segments. 

Packaging companies are those that provide tools and services to produce AR-enriched experiences. Think of it this way: when content has been "processed" through the packaging segment, a user who has the right sensors detecting its context receives ("experiences") that content in context. More specifically, it arrives in "camera view" (i.e., visually inserted over the physical world), as an "auditory" enrichment (i.e., a sound is produced for the user at a specific location or in a specific context), or as a "haptic" enrichment (i.e., the user feels something on their body when a sensor connects with a published augmentation that sends a signal back to the user). That's all AR in a nutshell.
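
To make the idea concrete, here is a minimal sketch (my own illustration, not any vendor's actual format) of what a "packaged" augmentation might look like as a data record: a trigger describing the context a sensor must detect, plus the enrichment (visual, auditory or haptic) delivered when that context is matched.

```python
from dataclasses import dataclass
from typing import Literal, Optional

# Hypothetical illustration only; the field names are my own, not a real SDK's schema.
@dataclass
class Trigger:
    kind: Literal["image", "location", "proximity"]  # what the sensor must detect
    reference: str                                   # e.g. a trained image ID or "lat,lon"
    radius_m: Optional[float] = None                 # for location/proximity triggers

@dataclass
class Enrichment:
    kind: Literal["visual", "auditory", "haptic"]    # how the user experiences it
    payload_uri: str                                 # 3D model, sound file, or vibration pattern

@dataclass
class Augmentation:
    trigger: Trigger
    enrichment: Enrichment

# Example: when the camera recognizes a trained poster image,
# overlay a 3D model in camera view.
poster_ar = Augmentation(
    trigger=Trigger(kind="image", reference="poster_0042"),
    enrichment=Enrichment(kind="visual", payload_uri="https://example.com/models/robot.glb"),
)
```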

In the packaging segment we find many sub-segments. This includes at least the AR SDK and toolkit providers, the Web-hosted content publishing platforms and the developers that provide professional services to content owners, brands and merchants (often represented by their appointed agencies).

Everyone, regardless of segment, is searching for a business model that will work for Augmented Reality in the long run. For value (defined for the moment as "something you pay for with either your attention or your money") to flow through an ecosystem segment, the requirement is simple: there must be buyers, spending their time or their money, and sellers who serve them. With the packaging segment in the middle, the likelihood is high that the things that matter in the long run, the things that generate revenues, will involve this segment.

The providers of software development tools for producing AR-enriched experiences (aka AR SDKs) all have the same goal (whether they announce it or not). The "game" today, while exploring all possible revenue streams, is to get the maximum number of developers onto your platform. If you have more developers, you might get the maximum number of projects executed on or with your platform. It's the number of projects (or augmentations) that's the real metric, the one that matters most. The SDK providers reach for this goal by attracting developers to their tools (directly or indirectly, using contests and other strategies) and/or by doing projects with content providers themselves (and thus competing with the developers). Cutting the developer segment out is not scalable, and cannibalizing your buyers is not recommended either, but those are separate subjects.

For some purposes, and since it drives the use of their products, packaging companies rely on and frequently partner with the providers of enabling technologies, the segment represented in the lower left corner of the figure. More about that below.

Since we are in the early days and no one is confident about what will work, virtually all the packaging segment players have multiple products or a mix of products and services to offer. They use/trade technologies among themselves and are generally searching for new business models. And the enabling technology providers get in the mix as well.

The assumption is that if a company is using an SDK, they are somehow "locked in" and the provider will be able to charge for something in the future, or that, if you are a hardware provider, your chips will have an advantage accelerating experiences developed with your SDK. If manufacturers of devices learn that experiences produced using a very popular SDK are always accelerated with a certain chipset, they might sell more devices, hence order more of these chips, or pay a premium for them. This logic probably holds true as long as there aren't standards or open source alternatives to a proprietary SDK.

Let's step back a few years, to when AR SDKs were licensed to developers on an annual or per-project basis. Licensing SDKs to third-party developers on a project basis was the primary revenue generator for computer vision-based SDK provider AR Toolworks, and annual licensing was relatively successful for the two largest companies (in terms of AR-driven revenues pre-2010), Total Immersion and metaio. These were also the largest revenue-generating models for over a dozen other less well-known companies until mid-2011. That's approximately when a "simple" annual or per-project licensing model was washed away, primarily by Qualcomm.

Although it is first an enabling technology provider (blue segment), Qualcomm released its computer vision-based SDK, Vuforia, with a royalty- and cost-free license in the last days of 2010 and more widely in early 2011. To compound the issue, Aurasma (an activity of Hewlett Packard since HP's Q3 2011 acquisition of Autonomy) came out in April 2011 with its no-cost SDK. Qualcomm and Aurasma weren't the first to do this. No one ever talks about it any more, but Nokia Point & Find (officially launched in April 2009 after a long closed beta) was the pre-smartphone era (Symbian) version. It contained and exposed via APIs all the various (visual) search capabilities within Nokia and was released as a service platform/SDK. It didn't catch on, for a variety of reasons.

So, where are we? Still unclear on why there are so many AR SDKs, or companies that say they offer them.

AR SDKs are easily and frequently confused with Visual Search SDKs. Visual Search SDKs permit a developer to use algorithms that match what's in the camera's view against images on which the algorithm was "trained," a machine learning term for processing an image or a frame of video and extracting/storing natural features in a unique arrangement (a pattern) that, when detected again in the same or a similar arrangement, produces a match. A Visual Search SDK leaves what happens after the match up to the developer. A match could bring up a Web page, as a match in a QR code scanner does. Or it could produce an AR-enriched experience.
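
To illustrate the mechanics, here is a rough sketch of natural-feature matching using the open source OpenCV library (my own illustration, not any particular vendor's SDK): features are extracted and stored from a "trained" reference image, then compared against features extracted from each camera frame; enough good correspondences constitute a match, and what happens next is left to the developer.

```python
import cv2

# "Training": extract and store natural features from a reference image.
orb = cv2.ORB_create(nfeatures=1000)
reference = cv2.imread("reference_poster.jpg", cv2.IMREAD_GRAYSCALE)
ref_keypoints, ref_descriptors = orb.detectAndCompute(reference, None)

def frame_matches(camera_frame_gray, min_good_matches=30):
    """Return True if the camera frame appears to contain the trained image."""
    keypoints, descriptors = orb.detectAndCompute(camera_frame_gray, None)
    if descriptors is None:
        return False
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    good = []
    # Ratio test: keep only correspondences clearly better than their runner-up.
    for pair in matcher.knnMatch(ref_descriptors, descriptors, k=2):
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])
    return len(good) >= min_good_matches

# A Visual Search SDK stops here; the developer decides what a match triggers:
# open a Web page (as a QR code scanner does) or start an AR-enriched experience.
```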

Therefore, Visual Search can be used by and is frequently part of an AR SDK. Many small companies are providing "just" the Visual Search SDKs: kooaba, Mobile Acuity, String Labs, Olaworks, milpix, eVision among others. And apparently there's still room for innovation here. Catchoom, a Telefonica I&D spin-off that's going to launch at ARE2012, is providing the Visual Search for junaio and Layar's Vision experiences.

Another newcomer that sounds like it is aiming for the same space that Catchoom has in its cross hairs (provides "visual search for brands") is Serge Media Corporation, a company founded (according to its press release) by three tech industry veterans and funded by a Luxembourg-based consortium. The company introduced the SergeSDK. Here's where the use of language is fuzzy and the confusion is clear. The SergeSDK Web page says that Aurasma is a partner. Well, maybe HP is where they are getting the deep pockets for the $1M prize for the best application developed using their SDK! If Aurasma is the provider of the visual search engine, then the SergeSDK is actually only a search "carousel" that appears at the top of the application. Sounds like a case where Aurasma is going to get more developers using its engine.

Hard to say how well this will work in the long run, or even over just the next year. There are few pockets deeper than those of Google and Apple when it comes to Visual Search (and AR). These companies have repeatedly demonstrated that they are incubating these technologies, and big plans are in store for us.

All right. Let's summarize. By comparison with other segments, the packaging segment of the AR ecosystem is a high-risk zone. It will either completely disappear or explode. That's why there are so many players and everyone wants to get in on the action!

Stay tuned: over the next 6 months this segment will undergo the most rapid and unpredictable changes, particularly when Google and Apple make their entries.

Categories
Augmented Reality

GOING OUTSIDE

Spring (and the Greenville Avenue St. Patrick’s Day Parade) has brought Dallasites out in droves today. Locals tell me that this is unusual. I’m reminded of this wonderful short article, originally published in The New Yorker’s March 28, 2011 issue.

JUST IN TIME FOR SPRING

By Ellis Weiner

Introducing GOING OUTSIDE, the astounding multipurpose activity platform that will revolutionize the way you spend your time.

GOING OUTSIDE is not a game or a program, not a device or an app, not a protocol or an operating system. Instead, it’s a comprehensive experiential mode that lets you perceive and do things firsthand, without any intervening media or technology.

GOING OUTSIDE:

1. Supports real-time experience through a seamless mind-body interface. By GOING OUTSIDE, you’ll rediscover the joy and satisfaction of actually doing something. To initiate actions, simply have your mind tell your body what to do—and then do it!

Example: Mary has one apple. You have zero apples. Mary says, “Hey, this apple is really good.” You think, How can I have an apple, too? By GOING OUTSIDE, it’s easy! Simply go to the market—physically—and buy an apple. Result? You have an apple, too.

Worried about how your body will react to GOING OUTSIDE? Don’t be—all your normal functions (respiration, circulation, digestion, etc.) continue as usual. Meanwhile, your own inboard, ear-based accelerometer enables you to assume any posture or orientation you wish (within limits imposed by Gravity™). It’s a snap to stand up, sit down, or lie down. If you want to lean against a wall, simply find a wall and lean against it.

2. Is completely hands-free. No keyboards, mice, controllers, touch pads, or joysticks. Use your hands as they were meant to be used, for doing things manually. Peeling potatoes, applauding, shooting baskets, scratching yourself—the possibilities are endless.

3. Delivers authentic 3-D, real-motion video, with no lag time or artifacts. Available colors encompass the entire spectrum to which human eyesight is sensitive. Blacks are pure. Shadows, textures, and reflections are beyond being exactly-like-what-they-are. They are what they are.

GOING OUTSIDE also supports viewing visuals in a full range of orientations. For Landscape Mode, simply look straight ahead—at a real landscape, if you so choose. To see things to the left or the right, shift your eyes in their sockets or turn your head from side to side. For Portrait Mode, merely tilt your head ninety degrees in either direction and use your eyes normally. Vision-correcting eyeglasses not included but widely available.

4. Delivers “head-free” surround sound. No headphones, earbuds, speakers, or sound-bar arrays required—and yet, amazingly, you hear everything. Sound is supported over the entire audible spectrum via instantaneous audio transmission. As soon as a noise occurs and its sound waves are propagated to your head, you hear it, with stunning realism, with your ears.

Plus, all sounds, noises, music, and human speech arrive with remarkable spatial-location accuracy. When someone behind you says, “Hey, are you on drugs, or what?,” you’ll hear the question actually coming from behind you.

5. Supports all known, and all unknown, smells. Some call it “the missing sense.” But once you start GOING OUTSIDE you’ll revel in a world of scent that no workstation, media center, 3-D movie, or smartphone can hope to match. Inhale through your nose. Smell that? That’s a smell, which you are experiencing in real time.

6. Enables complete interactivity with inanimate objects, animals, and Nature™. Enjoy the texture of real grass, listen to authentic birds, or discover a flower that has grown up out of the earth. By GOING OUTSIDE, you’ll be astounded by the number and variety of things there are in the world.

7. Provides instantaneous feedback for physical movement in all three dimensions. Motion through 3-D environments is immediate, on-demand, and entirely convincing. When you “pick up stuff from the dry cleaner’s,” you will literally be picking up stuff from the dry cleaner’s.

To hold an object, simply reach out and grasp it with your hand. To transit from location to location, merely walk, run, or otherwise travel from your point of origin toward your destination. Or take advantage of a wide variety of available supported transport devices.

8. Is fully scalable. You can interact with any number of people, from one to more than six billion, simply by GOING OUTSIDE. How? Just go to a place where there are people and speak to them. But be careful—they may speak back to you! Or remain alone and talk to yourself.

9. Affords you the opportunity to experience completely actual weather. You’ll know if it’s hot or cold in your area because you’ll feel hot or cold immediately after GOING OUTSIDE. You’ll think it’s really raining when it rains, because it is.

10. Brings a world of cultural excitement within reach. Enjoy access to museums, concerts, plays, and films. After GOING OUTSIDE, the Louvre is but a plane ride away.

11. Provides access to everything not in your home, dorm room, or cubicle. Buildings, houses, shops, restaurants, bowling alleys, snack stands, and other facilities, as well as parks, beaches, mountains, deserts, tundras, taigas, savannahs, plains, rivers, veldts, meadows, and all the other features of the geophysical world, become startlingly and convincingly real when you go to them. Take part in actual sporting events, or observe them as a “spectator.” Walk across the street, dive into a lake, or jump on a trampoline surrounded by happy children. After GOING OUTSIDE, you’re limited not by your imagination but by the rest of Reality™.

Millions of people have already tried GOING OUTSIDE. Many of your “friends” may even be GOING OUTSIDE right now! Why not join them and see what happens?

Categories
Augmented Reality Standards

Open and Interoperable AR

I’ve been involved in and observed technology standards for nearly 20 years. I’ve seen the boom that came about because of the W3C's work and the Web standards that were put in place early. The standards for HTTP and HTML made content publishing for a wider audience much more attractive to the owners and developers of content than having to format their content for each individual end-user application.

I’ve also seen standards introduced too early in emerging industries. For example, the ITU H.320 standards of the late 1980s were too limiting and stifled innovation in the videoconferencing industry for the following decade. Even though there was an effort to correct the problem in the mid-1990s with H.323, the architectures were still too limiting, and eventually much of the world went to SIP (the IETF Session Initiation Protocol). But even SIP has had only limited impact compared with Skype when it comes to the adoption of video calling. So this is an example where, even though good standards are available and implemented by large companies, the mass market just wants things that work, the first time and every time. AR is a much larger opportunity, and probably closer to the Web than to videoconferencing or video calling.

With AR, there’s more than just a terminal and a network entity, or two terminals talking to one another. As I wrote in my recent post about the AR Standards work, AR is starved for content, and without widespread adoption of standards, publishers are not going to bother making their content available. Not only is it too difficult to reach audiences on fragmented platforms, there’s no clear business model. If, however, we have easy ways to publish to massive audiences, traditional business models, such as premium content subscriptions and pay-to-watch (or pay-to-experience), become viable.

I don’t anticipate that mass market AR can happen without open AR content publishing and management as part of other enterprise platforms. The systems have to be open and to interoperate at many levels. That's why in late 2009 I began working with other advocates of open AR to bring experts in different fields together. We gained momentum in 2011 when the Open Geospatial Consortium and the Khronos Group recognized our potential to help. These two standards development organizations see AR as very central to what they provide. The use of AR drives the adoption of faster, high performance processors (which members of the Khronos Group provide) and location-based information.

There are other organizations consistently participating and making valuable contributions to each of our meetings. In terms of other SDOs, in addition to OGC and Khronos, the W3C, two subcommittees from ISO/IEC, the Open Mobile Alliance, the Web3D Consortium and the Society for Information Display report regularly about what they’re doing. The commercial and research organizations that attend include, for example, the Fraunhofer IGD, Layar, Wikitude, Canon, Opera Software, Sony Ericsson, ST Ericsson and Qualcomm. We also really value the dozens of independent AR developers who come and contribute their experience. Mostly they’re from Europe, but at the meeting in Austin we expect a new crop of US-based AR developers to show up.

Each meeting is different and always very valuable. I'm very much looking forward to next week!

Categories
Augmented Reality Events

Aurasma at GDC12 and SXSW12

I was unable to attend the Game Developers Conference last week in San Francisco, but it sounds like it was a good event. I enjoyed reading Damon Hernandez's post on Artificial Intelligence. Damon and I are working together on the AR in Texas Workshops March 16 and 17.

At GDC12, Aurasma was in the ARM booth showing Social AR experiences. In this video interview, David Stone gives some numbers, and his excitement about the platform nearly leaves him speechless.

The SXSW event is going on this week and Aurasma is there as well. In Austin, Aurasma broke the news about their partnership with Marvel Comics. This could have been good news for the future of AR-enhanced books. Unfortunately, the creative professionals who worked on this demonstration let us down. Watch the movie of this noisy animation showing what the character is capable of doing, and ask yourself: how many times does a "reader" want to watch this?

I fear the answer is: Zero. Is there any aspect of this experience sufficiently valuable for a customer to return? I could be wrong.

What more could the character have done? Well, something related to the story of the comic book, for starters!

Categories
Augmented Reality Events Standards

Interview with Neil Trevett

In preparation for the upcoming AR Standards Community Meeting March 19-20 in Austin, Texas, I’ve conducted a few interviews with experts. See here my interview with Marius Preda. Today’s special guest is Neil Trevett.

Neil Trevett is VP of Mobile Content at NVIDIA and President of the Khronos Group, where he created and chaired the OpenGL ES working group, which has defined the industry standard for 3D graphics on embedded devices. Trevett also chairs the OpenCL working group at Khronos, which is defining an open standard for heterogeneous computing.

Spime Wrangler: When did you begin working on standards and open specifications that are or will become relevant to Augmented Reality?

NT: It’s difficult to say because so many different standards are enabling ubiquitous computing and AR is used in so many different ways. We can point to graphics standards, geo-spatial standards, formatting, and other fundamental domains. [editor’s note: Here’s a page that gives an overview of existing standards used in AR.]

The lines between computer vision, 3D, graphics acceleration and use are not clearly drawn. And, depending on what type of AR you’re talking about, these may be useful, or totally irrelevant.

But, to answer your question, I’ve been pushing standards and working on the development of open APIs in this area for nearly 20 years. I first assumed a leadership role in 1997 as President of the Web3D Consortium (until 2005). In the Web3D Consortium, we worked on standards to bring real-time 3D to the Internet, and many of the core enablers for 3D in AR have their roots in that work.

Spime Wrangler: You are one of the few people who has attended all previous meetings of the International AR Standards Community. Why?

NT: The AR Standards Community brings together people and domains that otherwise don’t have opportunities to meet. So, getting to know the folks who are conducting research in AR, designing AR, implementing core enabling technologies, even artists and developers was a first goal. I need to know those people in order to understand their requirements. Without requirements, we don’t have useful standards. I’ve been taking what I learn during the AR Standards community meeting and working some of that knowledge into the Khronos Group.

The second incentive for attending the meetings is to hear what the other standards development organizations are working on that is relevant to AR. Each SDO has its own focus and we already have so much to do that we have very few opportunities to get an in depth report on what’s going on within other SDOs, to understand the stage of development and to see points for collaboration.

Finally, the AR Standards Community meetings permit the Khronos Group to share with the participants in the community what we’re working on and to receive direct feedback from experts in AR. Not only are the requirements important to us, but also the level of interest a particular new activity receives. If, during the community meeting I detect a lot of interest and value, I can be pretty sure that there will be customers for these open APIs down the road.

Spime Wrangler: Can you please describe the evolution you’ve seen in the substance of the meetings over the past 18 months?

NT: The evolution of this space has been rapid, by standards development standards! This is probably because a lot of folks have thought about the potential of AR as just another way of interfacing with the world. There’s also been decades of research in this area. Proprietary silos are just not going to be able to cover all the use cases and platforms on which AR could be useful. 

In Seoul, it wasn’t a blank slate. We were picking up on and continuing the work begun in prior meetings of the Korean AR Standards community that had taken place earlier in 2010. And the W3C POI Working Group had just been approved as an outcome of the W3C Workshop on AR and the Web.

Over the course of 2011 we were able to bring in more of the SDOs. For example, the OGC and the Web3D Consortium started presenting their activities during the Second community meeting. The OMA MobAR Enabler work item was presented, and ISO SC 24 WG 9 chair Gerry Kim participated in the Third Meeting, held in conjunction with the Open Geospatial Consortium’s meeting in Taiwan.

We’ve also established and been moving forward with several community resources. I’d say the initiation of work on an AR Reference Architecture is an important milestone.

There’s a really committed group of people who form the core, but many others are joining and observing at different levels.

Spime Wrangler: What are your goals for the meeting in Austin?

NT: During the next community meeting, the Khronos Group expects to share the progress made in the newly formed StreamInput WG. We’re just beginning this work, but there are already great contributions, and we know that the AR community needs these APIs.

I also want to contribute to the ongoing work on the AR Reference Architecture. This will be the first meeting in which MPEG joins us; Marius Preda will present what they have been doing, as well as the new work they are initiating on 3D Transmission standards that builds on past MPEG standards.

It’s going to be an exciting meeting and I’m looking forward to participating!

Categories
3D Information Augmented Reality Innovation

Playing with Urban Augmented Reality

AR and cities go well together. One reason is that, by comparison with rural landscapes, the urban environment is quite well documented (with 3D models, photographs, maps, etc.). A second reason is that some features of the environment, like the buildings, are stationary, while others, like the people and cars, are moving. Another reason the two fit naturally together is that there's a lot more information that can be associated with places and things than those of us passing through can see with our "naked" eyes. There's also a mutual desire: people, both those moving about in urban landscapes and those who have information about the spaces, need or want to make these connections more visible and more meaningful.

The applications for AR in cities are numerous. Sometimes the value of the AR experience is just to have fun. Let's imagine playing a game that involves the physical world and information encoded with (or developed in real time for use with) a building's surface. Mobile Projection Unit (MPU) Labs is an Australian start-up doing some really interesting work that demonstrates this principle. They've taken the concept of the popular mobile game "Snake" and, by combining it with a small projector, a smartphone and the real world, made something new. Here's the text from their minimalist web page:

"When ‘Snake the Planet!” is projected onto buildings, each level is generated individually and based on the selected facade. Windows, door frames, pipes and signs all become boundaries and obstacles in the game. Shapes and pixels collide with these boundaries like real objects. The multi-player mode lets players intentionally block each other’s path in order to destroy the opponent."

Besides this text, there's a quick motivational "statement" by one of the designers (it does not play on the page for me, but click on the vimeo logo to open it):

And here is a two-minute video clip of the experience in action:

I'd like to take this out for a test drive. Does anyone know these guys?

Categories
Augmented Reality Events

AR@MWC12

I'm heading to Barcelona for Mobile World Congress, the annual gathering of the mobile industry. It's always an exciting event and I meet a lot of interesting companies, some with whom I'm already acquainted, industry leaders, some new to the segments on which I focus but well known in mobile, and others that I've never heard of.

When I arrive in Barcelona on Sunday, I'm going to begin using the MWC AR Navigator, an application developed by mCRUMBS in collaboration with GSMA, the organizer of MWC, to make getting around the city and the event as easy and efficient as possible (with the assistance of AR, of course).

On Monday Feb 27 my first priority will be the Augmented Reality Forum. This half-day conference is sponsored by Khronos Group and four Khronos member companies: Imagination Technologies, ST Ericsson, ARM and Freescale. Through the AR Forum presentations, these companies are driving mobile AR awareness and sharing how their new hardware and open APIs from Khronos will improve AR experiences.

After the presentations, I will be moderating the panel discussion with the speakers. Join us!

In the following days, my agenda includes meetings with over 50 companies developing devices, software and content for AR experiences. Many of those with whom I will meet don't have an actual booth. Finding these people among the 60,000 attendees would be impossible without appointments scheduled in advance (and the aid of the MWC AR Navigator)! If you are attending MWC and want to set aside time for a quick briefing with me, please contact me as soon as possible.

If you haven't booked meetings but want to see for yourself what's new for AR in 2012, I recommend that you at least drop by these booths for demonstrations:

  • Imagination Technologies – Hall 1 D45
  • ST Ericsson partner zone – Hall 7 D45
  • ARM – Hall 1 C01
  • Freescale – AV27
  • Qualcomm – Hall 8
  • Nokia/NAVTEQ – Hall 7
  • Alcatel-Lucent – Hall 6
  • Aurasma (HP) – Hall 7
  • Texas Instruments – Hall 8 A84
  • Intel – Hall 8 B192
  • VTT – Hall 2 G12
  • mCRUMBS and metaio – Hall 2.1 C60
  • HealthAlert App – Hall 2.1 E65
  • Augmented Reality Lab – Hall 2 H47
  • Blippar – Avenue AV35
  • BRGR Media – Hall 2 F49
  • Pordiva – Hall 2 E66
  • wöwbile Mobile Marketing – Hall 7 D85

These are not the only places you will see AR. If you would like me to add others to this list, please leave a comment below.

Categories
Augmented Reality Innovation

Square Pegs and Round Holes

For several years I've attempted to bring AR (more specifically, mobile AR) to the attention of a segment of media companies, those that produce and sell print and digital content of all kinds (aka "publishers"), as a way of bringing value to both their physical and digital media assets.

My investments have included writing a white paper and a position paper. I've mused on the topic in blog posts (here and here), conducted workshops, and traveled to North America to present my recommendations at meetings where the forward-looking segment of the publishing industry gathers (e.g., Tools of Change 2010, NFAIS 2011).

I've learned a lot in the process, but I do not think I've had any impact on these businesses. As far as the publishers of books, newspapers, magazines and other print and digital content (and those who manage them) are concerned, visual search is moderately interesting but mobile AR technology is a square peg. It just has not fit the geometry of their needs (a round hole).

With their words and actions, publishers have demonstrated that they are moving as quickly as they possibly can (and it may not be fast enough) towards “all digital.” Notions of extending the life expectancy of their print media assets by combining them with interactive digital annotations are misplaced. They don’t have these thoughts. I was under the illusion that there would be a fertile place, at least worthy of exploration, between page and screen, so to speak. Forget it.

After digesting this and coming back to the topic (almost a year since having last pushed it), I’ve adjusted my thinking. Publishers are owners and managers of assets that are (and increasingly will be) used interactively. The primary difference between publishers and other businesses that have information assets is that the publisher has the responsibility to monetize the assets directly, meaning by charging for the asset itself, not for some secondary product or service. The relative size and complexity of digital archives may also be greater in a company that I would label a “publisher,” but publishers come in all sizes, so this distinction is, perhaps, not valuable.

Publishers both feed and rely upon the digital media asset production and distribution ecosystem. Some parts of the ecosystem are the same companies that served the publishers when their medium was print. For example, companies like MarkLogic and dozens of others (one other example here) provide digital asset management systems. When I approached a few companies in the asset management solution segment, they made it clear that if there’s no demand for a feature, they’re not going to build it.

Distribution companies, like Barnes & Noble and Amazon, are key to the business model in that they serve to put both the print (via shipping customers and bookstores) and digital (via eReaders) assets in the hands of readers (the human type).

Perhaps this is where differentiation and innovation with AR will make a difference. I hope to explore if and how the eReader product segment could apply AR technology to sell more, and more often.

Categories
Augmented Reality Events Standards

Interview with Marius Preda

On March 19 and 20, 2012 the AR Standards Community will gather in Austin, Texas. In the weeks leading up to the next (the fifth) International AR Standards Community meeting, sponsored by Khronos Group and the Open Geospatial Consortium, experts are preparing their position papers and planning contributions.

I am pleased to be able to share a recent interview with one of the participants in the upcoming meeting, Marius Preda. Marius is Associate Professor with the Institut TELECOM, France, and the founder and head of GRIN – Graphics and Interactive Media. He is currently the chairperson of the MPEG 3D Graphics group. He has been actively involved in MPEG since 1998, focusing especially on Video and 3D Graphics coding. He is the main contributor to the new animation tools dedicated to generic synthetic objects. More recently, he has been the leader of the MPEG-V and MPEG AR groups. He is also the editor of several MPEG standards.

Marius Preda’s research interests include 3D graphics compression, virtual characters, rendering, interactive rich multimedia and multimedia standardization. He also leads several research projects at the institutional, national, European and international levels.

Spime Wrangler:  Where did you first learn about the work going on in the AR Standards community?

MP: In July 2011, during the preparations for the 97th MPEG meeting, held in Turin, Italy, I had the pleasure of meeting Christine Perey. She came to the meeting of the MPEG-V AhG, a group that is creating, under the umbrella of MPEG, a series of standards dealing with sensors, actuators and, in general, the frontier between the physical and virtual worlds.

Spime Wrangler:  What sorts of activities are going on in MPEG (ISO/IEC JTC1 SC29 WG11) that are most relevant to AR and visual search? Is there a document or white paper you have written on this topic?

MP: Since 1998, when the first edition of MPEG-4 was published, the concept of mixed (natural and synthetic) content has been possible in an open and rich standard that was relatively advanced for its time. MPEG-4 was not only advancing the compression of audio and video, but also introducing, for the first time, compression for graphics assets. Later on, MPEG revisited the framework for 3D graphics compression and grouped into Part 16 of MPEG-4 several tools allowing compact representation of 3D assets.

Separately, MPEG published in 2011 the first edition of the MPEG-V specification, a standard defining the representation format for sensors and actuators. Using this standard, it is possible to deal with data from the simplest sensors, such as temperature, light, orientation and position, to very complex ones such as biosensors and motion cameras. The same goes for actuators: from the simple vibration effect embedded in almost all of today's mobile phones to the complex motion chairs used in 4D theatres, these can all be specified in standard-compliant libraries.

Finally, several years ago, MPEG standardized MPEG-7, a method for describing media content through descriptors. This work is currently being extended: with a set of compact descriptors for natural objects, we are working on Visual Search. MPEG also has ongoing work on the compression of 3D video, a key technology for providing and rendering realistic augmentation of the captured image in real time.

Based on these specifications and the significant know-how in domains highly relevant to AR, MPEG decided in December 2011 to publish an application format for Augmented Reality, grouping together relevant standards in order to build a deterministic, solid and useful model for AR applications and services.

More information related to MPEG standards is available here.

Spime Wrangler:  Why are you going to attend the meeting in Austin? I mean, what are your motivations and what do you hope to achieve there?

MP: The objective of AR Standards is laudable but, at the same time, relatively difficult to achieve. There are currently several, probably too many, standardization bodies that claim to deliver relevant standards for AR to the industry. Our duty, as standards organizations, is to provide interoperable solutions. This is not new. Organizations, including standards development bodies, always try to use mechanisms such as liaisons to cooperate rather than compete.

A recent, very successful example of this is the work on video coding jointly created by ISO/IEC MPEG and ITU-T VCEG and published separately under the names MPEG-4 AVC and H.264, respectively. It is in fact exactly the same document, and a product compliant with one is implicitly compliant with the other. My motivation for participating in the Austin meeting is to see whether such a collaborative approach is also possible in the field of AR.

Spime Wrangler: Can you please give us a sneak peak into what you are going to present and share with the community on March 19-20?

MP: I’ll present two aspects of MPEG work related to AR. In the first presentation, I’ll talk about MPEG-4 Part 16 and Part 25. The first proposes a set of tools for 3D graphics compression; the second, an approach for applying these tools to scene graph representations other than the one proposed by MPEG-4, e.g. COLLADA and X3D. So, as you can see, there are several AR-related activities going on in parallel.

In the second presentation I’ll talk about the MPEG Augmented Reality Application Format (ARAF), and MARBle, an MPEG browser for AR developed by the TOTEM project (currently available for use on Android phones). ARAF is an ongoing activity in MPEG, and early contact with other standards bodies may help us all work towards the same vision of providing a one-stop solution for AR applications and services.

Categories
Augmented Reality

Augmented Vision 2

It’s time, following my post on Rob Spence’s Augmented Vision and the recent buzz in the blogosphere about eyewear for hands-free AR (on TechCrunch Feb 6, on Wired Feb 13, on Augmented Planet Feb 15), to return to the topic.

I could examine the current state of the art of the technology for hands-free AR (the hardware, the software and the content). But there’s too much information I could not reveal, and much more I have yet to discover.

I could speculate about if, what and when Google will introduce its Goggles, as has been rumored for nearly 3 months. By the way, I didn’t need a report to shed light on this. In April 2011, when I visited the Google campus, one of the people with whom I met (complete with his personal display) was wearable computing guru and director of the Georgia Institute of Technology Contextual Computing Group, Thad Starner. A matter of months later, he was followed to Google by Rich deVaul, whose 2003 dissertation on The Memory Glasses project certainly qualifies him on the subject of eyewear. There could, in the near future, be some cool new products rolling out for us, “ordinary humans,” to take photos with our sunglasses and transfer them to our smartphones. There might be tools for creating a log of our lives with these, which would be very helpful. But these are not, purely speaking, AR applications.

Instead, let me focus on who, in my opinion, is most likely to adopt the next generation of non-military see-through eyewear with AR capabilities. It will not be you or I, nor the early technology adopter next door, who will have the next generation of see-through eyewear for AR.

It will be those for whom having certain, very specific pieces of additional information available in real time (with the ability to convey them to others), while also having the use of both hands, is life-saving or performance-enhancing. In other words, professional applications are going to come first. In the life-saving category, those who work in the most dangerous field in the world (i.e., military action) probably already have something close to AR.

Beyond defense, let’s assume that those who respond to a location new to them for the purpose of rescuing people endangered by fire, flooding, earthquakes and other disasters need both of their hands as well as real-time information about their surroundings. This blog post on the Tanagram web site (where the image above is from) makes a very strong case for the use of AR vision.

People who explore dark places, such as underwater crevices near a shipwreck or a mine shaft, already have cameras on their heads and suits that monitor heart rate, temperature, pressure and other ambient conditions. The next logical step is to superimpose helpful information on the immediate surroundings. Using cameras to recognize natural features in buildings (with or without the aid of markers), and altimeters to determine how far underground or above ground the user has gone, floor plans and readings from local activity sensors could be very valuable for saving lives.
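
As a back-of-the-envelope illustration of the altimeter part (my own sketch, not a shipping product), a barometric pressure reading can be converted to altitude with the international barometric formula and matched against a building's floor heights to choose which floor plan to superimpose:

```python
import math

SEA_LEVEL_PRESSURE_HPA = 1013.25  # standard atmosphere; a real system would calibrate locally

def pressure_to_altitude_m(pressure_hpa, sea_level_hpa=SEA_LEVEL_PRESSURE_HPA):
    """International barometric formula: altitude in metres from pressure in hPa."""
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))

def floor_for_altitude(altitude_m, ground_altitude_m, floor_height_m=3.0):
    """Pick the floor (0 = ground level, negative = below ground) whose plan to display."""
    return math.floor((altitude_m - ground_altitude_m) / floor_height_m)

# Example: a responder's sensor reads 976 hPa inside a building whose ground
# floor sits at 300 m above sea level -> roughly the 5th floor.
altitude = pressure_to_altitude_m(976.0)
print(floor_for_altitude(altitude, ground_altitude_m=300.0))
```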

I hope never to have to rely on these myself, but I won’t be surprised if one day I find myself rescued from a dangerous place by a professional wearing head-mounted gear with Augmented Reality features.

Categories
Augmented Reality Business Strategy

Between Page, Screen, Lake and Life

In my post entitled Pop-up Poetry I wrote about the book/experience Between Page and Screen. Print, art and AR technology mix in very interesting ways, including this one, but I pointed out (with three brilliant examples) that this project is not the first case of a "magic book."

Like many similar works of its genre, Between Page and Screen uses FLARToolKit to project (display) images over a live video coming from the camera that is pointed at the book's pages. Other tools used in Between Page and Screen include the Robot Legs framework, Papervision for 3D effects, BetweenAS3 for animation and JibLib Flash. Any computer (whose user has first downloaded the application) with a webcam can play the book, which will be published in April. Ah ha! I thought it was available immediately, but now learn that one can only pre-order it from SiglioPress.com.

And, as Joann Pan suggests in her post about the book on Mashable, "combining the physicality of a printed book with the technology of Adobe Flash to create a virtual love story" is different. Pan interviewed the author and the writer of the AR code. She writes, "Borsuk, whose background is in book art and writing, and Bouse, developing his own startup, were mesmerized by the technology. The married duo combined their separate love of writing and technology to create this augmented reality art project that would explore the relationship between handmade books and digital spaces."

The more I've thought about this project and read various posts about Between Page and Screen in recent days, the more confident I am that, while I might experience a magic book once or twice, my preferred reading experience is to hold a well-written, traditional book. I decided to come back to this topic after I read about another type of "interactive" book on TechCrunch.

The first thing that caught my eye was the title. Fallen Lake. Fallen Leaf Lake! Of course! I used to live in the Sierra Nevada mountains, where the action in this novel is set, and Fallen Leaf Lake is an exceptionally beautiful body of water. But the post by John Biggs points out that the author of Fallen Lake, Laird Harrison, is going to be posting clues and "extra features" about the characters in the book by way of a password-protected blog.

All these technology embellishments on books seem complicated. Their purpose, Biggs believes, is to differentiate the work in order to get some tech blogger to write about the book and then, maybe, sell more copies.

Finally, Biggs points out that what he really wants in a book, what we all want and what will "save publishing," is good (excellent) writing. Gimmicks like AR and blog posts might add value, but first let's make sure the content is well worth the effort.

Categories
3D Information Augmented Reality Innovation

Improving AR Experiences with Gravity

I’m passionate about the use of AR in urban environments. However, having tested some simple applications, I have been very disappointed because the sensors on the smartphone I use (a Samsung Galaxy S) and the commercially available algorithms for feature detection are not well suited to showing me really stable or very precise augmentations over the real world.

I want to be able to point at a building and get specific information about the people or activities (e.g., businesses) within at a room-by-room/window-and-door level of precision. Instead, I’m lucky if I see small 2D labels that jiggle around in space, and don’t stay “glued” to the surface of a structure when I move around. Let’s face it, in an urban environment, humans don’t feel comfortable when the nearby buildings (or their parts) shake and float about!

Of course, this is not the only obstacle to urban AR use, and I’m not the first to discover this challenge. It’s been clear to researchers for much longer. To overcome it in the past, some developers used logos on buildings as markers. This certainly helped with recognizing which building I’m asking about and, based on the size of the logo, with estimating my distance from it, but there’s still low precision and poor alignment with edges.

In 4Q 2011 metaio began to share what its R&D team has come up with to address this, among other issues associated with blending digital information into the real world in more realistic ways. In its October 27 press release, the company described how, by combining gravity awareness with camera-based feature detection, it is able to improve the speed and performance of detecting real-world objects, especially buildings.

The applications for gravity awareness go well beyond urban AR. “In addition to enabling virtual promotions for real estate services, the gravity awareness in AR can also be used to improve the user experience in rendering virtual content that behaves like real objects; for example, virtual accessories, like a pair of earrings, will move according to how the user turns his or her head.”
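
metaio hasn't published its algorithm, but the underlying idea of gravity alignment can be sketched: a low-pass-filtered accelerometer reading estimates the gravity direction, and virtual content is rotated so that its "down" axis matches that direction, which keeps dangling objects (or building overlays) upright however the device is tilted. A minimal sketch, assuming NumPy:

```python
import numpy as np

def estimate_gravity(accel_samples, alpha=0.8):
    """Low-pass filter sensor samples (assumed to express the gravity direction in the device frame)."""
    g = np.array(accel_samples[0], dtype=float)
    for sample in accel_samples[1:]:
        g = alpha * g + (1.0 - alpha) * np.array(sample, dtype=float)
    return g / np.linalg.norm(g)

def gravity_alignment(gravity_unit, content_down=np.array([0.0, -1.0, 0.0])):
    """Rotation matrix mapping the virtual content's 'down' axis onto the measured gravity direction."""
    v = np.cross(content_down, gravity_unit)
    c = float(np.dot(content_down, gravity_unit))
    if np.isclose(c, -1.0):
        # Vectors point in opposite directions: a 180-degree flip about the x-axis works.
        return np.diag([1.0, -1.0, -1.0])
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + (vx @ vx) / (1.0 + c)

# Example: a device held at a slight tilt.
gravity = estimate_gravity([(0.2, -9.6, 1.1), (0.3, -9.7, 1.0), (0.2, -9.8, 1.2)])
R = gravity_alignment(gravity)
# Applying R to a virtual earring's vertices keeps it hanging toward real-world
# "down" as the user turns his or her head, as in the press release's example.
```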

The concept of Gravity Alignment is very simple. It is described and illustrated in this video:

Earlier this week (on January 30, 2012), metaio released a new video about what they’ve done over the past 6 months to bring this technology closer to commercial availability. The video below and some insights about when gravity aligned AR will be available on our devices have been written up in Engadget and numerous general technology blogs in recent days.

I will head right over to the Khronos Group-sponsored AR Forum at Mobile World Congress later this month to see if ARM will be demonstrating this on stage and to learn more about the value they expect to add to make Gravity Aligned AR part of my next device.