Live From ISMAR ’08: Augmented Reality Demo Round Up

ISMAR ’08 is the epicenter of the world’s best augmented reality demos. Here are the audience favorite picks:

The most beautiful demo

Markerless Magic Books

Created by the only artist at ISMAR ’08…
(on the left menu bar click Interaction/Haunted Book)

Our demonstration shows two artworks that rely on recent Computer Vision and Augmented Reality techniques to animate the illustrations of poetry books. Because we don’t need markers, we can achieve seamless integration of real and virtual elements to create the desired atmosphere. The visualization is done on a computer screen to avoid cumbersome Head-Mounted Displays. The camera is hidden in a desk lamp to further ease the spectator’s immersion. Our work is the result of a collaboration between an artist and Computer Vision researchers. It shows beautiful and poetic augmented reality. It is further described in our paper ‘The Haunted Book’.

Camille Scherrer, Julien Pilet, Vincent Lepetit (EPFL)

The most invisible demo

Sensor-fusion Based Augmented Reality with off the Shelf Mobile Phone

OK, you see a Scandinavian guy standing in the middle of the yard with a cell phone held high in his hand. What’s the big deal ? Exactly!

We demonstrate mobile augmented reality applications running on the newly released Nokia 6210 Navigator mobile phone. The device features an embedded 3D compass, 3D accelerometer, and assisted GPS unit – the fundamental ingredients for sensor-based pose estimation, in addition to smart-phone standards: forwards-pointing camera, high-resolution displays and internet connection. In our applications sensor based pose estimation is enhanced with computer vision methods and positioning error minimization techniques. Also the user interface solutions are designed to try to convey the relative uncertainty of the pose estimate to the user in intuitive ways.

Markus Kähäri, David J. Murphy (Nokia Research Center)

The most 90’s demo

See-Through Vision for Mobile Outdoor Augmented Reality

(compare to the previous demo)

We have developed a system built on our mobile Augmented Reality platform that provides users with see-through vision, allowing visualization of occluded objects textured with real-time video information. The demo participants will be able to wear our lightweight, belt-mounted wearable computer and head mounted display. The display will render hidden locations captured from the University of South Australia. These locations consist of 3D models of buildings and courtyard areas that are textured with pre-recorded video images. The system includes a collection of visualizations and tools that assist with viewing these occluded real-world locations; e.g. digital zoom and texture highlighting.

Benjamin Avery, Bruce H. Thomas, Wayne Piekarski, Christian Sandor  (University of South Australia)

The most playful mixed-reality game demo

Mobile Phone Augmented Reality

In our demo booth we will show a compilation of recent developments created by the Handheld AR group at Graz University of Technology and Imagination Computer Services. None of these demos has been shown before at a scientific conference, making it a unique experience for every ISMAR attendee. All our demos are hands-on: During our demos we will hand out devices and let people experience our applications.

Daniel Wagner, Alessandro Mulloni, Tobias Langlotz  (TU Graz), Istvan Barakonyi (Imagination),  Dieter Schmalstieg (TU Graz)

The most crowded demo

Superimposing Dynamic Range
In a dark, corner room the size of a closet, about 150 people are gathering around an artifact from the future….

We present a simple and low-cost method of superimposing high dynamic range visualizations on arbitrary reflective media, such as photographs, radiological paper prints, electronic paper, or even reflective three-dimensional items. Our technique is based on a secondary modulation of projected light when being surface reflected. This allows boosting contrast, perceivable tonal resolution, and color saturation beyond the possibility of projectors, or the capability of spatially uniform environment light when illuminating such media. It holds application potential for a variety of domains, such as radiology, astronomy, optical microscopy, conservation and restoration of historic art, modern art and entertainment installations.

Oliver Bimber (Bauhaus-University Weimar), Daisuke Iwai (Osaka University)

The most iTouchy demo

Multimodal Mobile Augmented Reality on the iPhone

How do you spell ARToolkit in iPhonese? (Hype is a beautiful thing)

In this demonstration we show how the Apple iPhone can be used as a platform for interesting mobile phone based AR applications, especially because of its support for multimodal input. We have ported a version of the ARToolKit library to the iPhone and customized it for the unique input capabilities of this platform. The demo shows multimarker-based tracking, virtual object rendering and AR overlay, gesture-based interaction with shared virtual content, and accelerometer input. This demonstration shows some of the possibilities of AR when there is no hardware to configure, no interface to learn, and the interaction is natural and intuitive.

Philip Lamb (ARToolworks)

The most down-under demo

An Augmented Reality Weather System

You have to live down-under to conceive a machine that simulates bad weather…brilliant!

This demo presents ARWeather, a simulation application, which can simulate three types of precipitation: rain, snow, and hail. Our goal is to fully immerse the user in the simulated weather by multimodal rendering of audio and graphics, while preserving autonomous and free movement of the user. Therefore, ARWeather was developed and deployed on the Tinmith wearable computer system. Software highlights of this demo include: GPU-accelerated particle systems and video processing, spatial audio with OpenAL, and physics-based interaction of particles with the environment (e.g., hail bounces off the ground).

Marko Heinrich (U. Koblenz-Landau), Bruce H. Thomas (U. South Australia), Stefan Mueller (U. Koblenz-Landau), Christian Sandor (U. South Australia)

The most highbrow demo

AR Museum Presentation Room

I never would have learned about this ancient plate’s history – had AR not been invented. A Classic.

The artwork to which the augmented reality technology is applied is a plate produced by the technique called metallic lustre. Around the exhibited real artwork, information is provided by multimedia tools, offering the visitor various approaches to the artwork. Adding information with augmented reality is intuitive and offers an illustration of something that cannot be seen by the naked eye, without turning away the visitor’s eyes from the real artwork. The system is currently in use at the Louvre – DNP Museum Lab (LDML) – Tokyo/Japan.

T. Miyashita (Dai Nippon Printing), P. Meier (metaio), S. Orlic (Musée du Louvre), T. Eble, V. Scholz,  A. Gapel, O. Gerl, S. Arnaudov,  S. Lieberknecht (metaio)

The “I am waaaay ahead of you” demo

Mapping large environments using multiple maps for wearable augmented reality

video of last year's demo

A demonstration of a wearable robotic system that uses an extended version of the parallel tracking and mapping system by Klein and Murray from ISMAR 2007. This extended version allows multiple independent cameras to be used to build a map in unison, and to also create multiple independent maps around an environment. The user can explore an environment in a natural way, acquiring local maps in real-time. When revisiting those areas the system will select the correct local map and continue tracking and structural acquisition, while the user views relevant AR constructs registered to that map.

Robert Castle, Georg Klein & David W. Murray (University of Oxford)

Additional great demos weren’t included due to the lack of space on this post and lack of sleep of the author…

Live from ISMAR ’08: Awards, Winners, and Wrap up of the World’s Best Augmented Reality Event

They say every good thing has an end…and this event is no exception; ISMAR ’08, the world’s most important augmented reality event, is coming to a close on a high note and with fireworks (augmented, of course).

That’s the part where the event chairs recognize the organizers who have made it possible, and thank the keynote speakers, paper submitters, demo exhibitors, poster presenters, competition contenders, and all participants for making it such a memorable event.

Cut to…flashback. It’s last night at King’s College; Ron Azuma is the MC for the best paper award ceremony…

King's College "Cafeteria"

And the honorable mention goes to: Georg Klein and David Murray, “Compositing For Small Cameras”…winners of last year’s best paper…“this is excellent work…many other practitioners can use these results”

Best student paper (tied with best paper): Sheng Liu, Dewen Cheng, Hong Hua “An Optical See-Through Head Mounted Display with Addressable Focal Planes”…it’s a breakthrough…this strikes me as a memorable and important step forward for HMD technology

Best paper: Daniel Wagner, Gerhard Reitmayr, Alessandro Mulloni, Tom Drummond, Dieter Schmalstieg “Pose Tracking from Natural Features on Mobile Phones”…“WOW! SIFT and the Ferns running in (almost) real-time on mobile phones…the performance is truly impressive and opens the door to amazing future applications”

Congrats. You guys are the 800 pound Gorillas in augmented reality.

Cut to…flash forward. It’s the present back in the Cambridge Engineering Department. The winners of the Tracking Competition are about to be announced by the competition team:

Tracking Competition setup

We defined the setup in a large room in the department, with reference points and coordinates and installed 8 different stations with many different objects in them. We made it really hard on the competitors. We gave them time to prepare; they got coordinates of 16 items which they had to pick using their AR tracking technology.

We started with 5 contenders: Metaio (Tobias Eble), Fraunhofer (Harald Wuest), University of Bristol (Sudeep Sundaram), Millennium 3 Engineering (Mark Fiala), and University of Oxford (Georg Klein). Mark Fiala unfortunately had to drop out due to lack of sufficient preparation time. Bristol thought the room was missing some features…

And here are the results: in second place came Metaio with 15 items picked in a little more than 10 minutes…and in first place [the audience favorite] Georg Klein, who picked all 16 items in a record time of 8:48!


Georg (Le magnifique) will return to Oxford with an extra 1000 pounds in his pocket. And he’s humble and gracious:

Thanks to Robert Castle for providing the method (parallel tracking and mapping) which I adapted this morning – it was greatly suitable for the task.

And for those who wonder what kind of bug drove me to write more than 10,000 words in 17 posts within 4 days – I have one word for you: passion…plus the amazing support I got from ISMAR attendees and chairs, and mostly – you guys: AR avid fans out there, that weren’t as fortunate and couldn’t attend the event this year. THANK YOU!

They also say it’s never over ’till the fat lady sings…and in this case Christopher Stapleton plays the role: he’s the last to come up on stage and his deep voice vibrates across the walls of the auditorium as he shouts into the mic:

ISMAR 2009 Experience starts right now!

If you want to be part of it, help or support it – just send a note to

Fade out. Credits. Slideshow. We’re outta here.

Live from ISMAR ’08 : Perfecting Augmented Reality

The last session of ISMAR ’08 is about to begin, and it concentrates on perfecting Rendering and Scene Acquisition in augmented reality and making it even more realistic.

First on stage is Yusaku Nishina with a challenging talk attempting Photometric registration by adaptive high dynamic range image generation for augmented reality.

His goal: development of photorealistic augmented reality with a High Dynamic Range (HDR) image.

Estimating the lighting environment of virtual objects is difficult because of low dynamic range cameras. In order to overcome this problem, they propose a method that estimates the lighting environment from an HDR image and renders virtual objects using an HDR environment map. Virtual objects are overlaid in real-time by adjusting the dynamic range of the rendered image with tone mapping according to the exposure time of the camera. The HDR image is generated from multiple images captured with various exposure times.
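For readers who want to see the idea in code, here is a minimal sketch of the two steps described above: merging several exposures into one HDR radiance estimate, then tone mapping that estimate back to what a camera would record at a given exposure time. It is not the authors' implementation. The function names, the linear camera response, the weighting scheme, and the clipping thresholds are all assumptions made for illustration.

```python
def merge_exposures(images, exposure_times, low=0.05, high=0.95):
    """Estimate per-pixel scene radiance from several exposures.

    images: equally sized lists of pixel values in [0, 1], assumed to
        respond linearly to light (an assumption of this sketch).
    exposure_times: shutter time in seconds for each image.
    """
    radiance = []
    for pixel_values in zip(*images):
        weighted_sum = 0.0
        weight_total = 0.0
        for value, t in zip(pixel_values, exposure_times):
            # Trust mid-range pixels; skip under- and over-exposed ones.
            if low <= value <= high:
                weight = 1.0 - abs(2.0 * value - 1.0) + 1e-3
                weighted_sum += weight * (value / t)
                weight_total += weight
        if weight_total == 0.0:
            # Every sample was clipped: fall back to a plain average.
            samples = [v / t for v, t in zip(pixel_values, exposure_times)]
            radiance.append(sum(samples) / len(samples))
        else:
            radiance.append(weighted_sum / weight_total)
    return radiance


def tone_map_for_exposure(radiance, exposure_time):
    """Scale HDR radiance by the exposure time and clip to [0, 1]."""
    return [min(1.0, r * exposure_time) for r in radiance]


if __name__ == "__main__":
    # Two pixels seen at a long and a short exposure (made-up values).
    long_exposure = [0.5, 1.0]     # second pixel is saturated
    short_exposure = [0.025, 0.5]  # first pixel is nearly black
    hdr = merge_exposures([long_exposure, short_exposure], [0.25, 0.0125])
    print(hdr)
    print(tone_map_for_exposure(hdr, 0.25))
```

The point of the tone mapping step is that a virtual object lit from the HDR map is rendered at the same exposure as the live video, so it brightens and saturates together with the real scene.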

Now you are ready to watch the resulting effect. Incredible.



Next on stage is the soon-to-be-hero-of-the-show Georg Klein (more on that later…) with Compositing for Small Cameras

Blending virtual items into real scenes. It can work with small cameras. Video from such cameras tends to be imperfect (blurring, over-saturation, radial distortion, etc.), so when you superimpose a virtual item it tends to stick out in a bad way. Since we can’t improve the live video, we will try to adapt the virtual item to match the video at hand. Simply put, Georg samples the background and applies its characteristics to the virtual image so that it matches the blur, radial distortion, rotation, color saturation, etc., and he does it in 5 milliseconds on a desktop… For details check the pdf paper; take a look for yourself and tell me if it works on Cartman:

Done! Georg is already working on the next challenge.


Following is Pished Bunnun, who introduces his work: OutlinAR: an assisted interactive model building system with reduced computational effort

Building 3D models interactively and in place (in-situ), using a single camera, and low computational effort – with a makeshift joystick (Button and wheels.)

In this case the video does a better job at explaining the concept than any number of words would…

Pished demonstrates it’s fast and pretty robust. You judge for yourself.

If you absolutely need more words about this – start here.

The team’s next challenge: make curved lines…


In the very last talk of the event Jason Wither courageously takes on another challenge to perfecting augmented reality, with his talk: Fast Annotation and Automatic Model Construction with a Single-Point Laser Range Finder

Jason is using a laser range finder typically used by hunters (though he will not be shooting anything or anybody), mounted on the head or handheld, in conjunction with a parallel camera. First he wants to create an annotation. That’s totally trivial. But you can then orient the annotation according to a building, for example.

Next, he is going to correct occlusion of virtual objects by real objects for improved augmented realism. Just click before and after the object and pronto:

Finally he will create a 3D model of an urban environment semi-automatically, by creating a depth map courtesy of the laser. To achieve that he’s using a fusion process. You’ve got to see that video; the laser’s red line advancing on buildings reminds me of the blob swallowing the city in that quirky Steve McQueen movie.

In conclusion this is a really low cost and fast approach for modeling and annotation of urban environments and objects. That capability would become extremely handy once Augmented Reality 2.0 picks up and anyone would want to annotate the environment (aka draw graffiti without breaking the law).

Next is the event wrap up and the results of the Tracking Competition. Stay tuned.


From the ISMAR ’08 program:

Rendering and Scene Acquisition

  • Photometric registration by adaptive high dynamic range image generation for augmented reality
    Yusaku Nishina, Bunyo Okumura, Masayuki Kanbara, Naokazu Yokoya
  • Compositing for Small Cameras (pdf paper)
    Georg Klein, David Murray
  • OutlinAR: an assisted interactive model building system with reduced computational effort
    Pished Bunnun, Walterio Mayol-Cuevas
  • Fast Annotation and Automatic Model Construction with a Single-Point Laser Range Finder
    Jason Wither, Chris Coffin, Jonathan Ventura, Tobias Hollerer

Live from ISMAR ’08: Is Augmented Reality at Work Better than Reality Itself ?

Bruce Thomas introduces the afternoon session at ISMAR ’08 focusing on user studies in industrial augmented reality.

First is Johannes Tuemler, who will talk about Mobile Augmented Reality in Industrial Applications: Approaches for Solution of User-Related Issues.

The study looks at psychological and ergonomic factors in augmented reality usage and creates a requirements catalog for mobile AR assistance systems in diverse scenarios. This was a collaboration with Volkswagen, the Ergonomics department in Ott-von-Wolfsburg, Perception Psychology from Weimar University, and Information Technology by the Fraunhofer Institute.

The reference scenario chosen was “AR picking”, where subjects would work for a couple of hours picking items from shelves using a mobile AR device. The users reported no rise of stress level with an AR system compared with no AR (except for some visual discomfort). Since the AR system was less than optimal, the research may point to the fact that with a better AR system the stress level of workers – compared with no AR system – could be reduced!


As a direct follow up to the first study, Bjoern Schwerdtfeger comes on stage to describe the results of his work on Supporting Order Picking with AR.

Traditionally the system includes a print out with instructions of what items to pick from bins on shelving.

How can an AR system help improve the performance of such an activity?

Glasstron by Nomad

They looked at multiple visualization options: Frame tunnel, Rings tunnel, and 3D Arrow.

The results showed that the frame visualization was more efficient than the arrow. It’s not clear whether the rings visualization is superior.


Final speaker for this session is Gerhard Schall from Graz University to discuss Virtual Redlining for Civil Engineering in Real Environments.

What is virtual redlining? Virtually annotating paper maps or 2D digital information systems (mostly for the utility sector). This process helps significantly in the workflows associated with network planning or inspection.

The process involved mapping of 2D geographical data with 3D models of buildings and underground infrastructure. The tool developed allows for collaboration, inspection, and annotation.

Results of the usage study confirm that the AR system has a significant advantage in civil engineering – in this redlining scenario. The color coding was important, as well as the digital terrain model.

Question from the audience: where do you get the 3D modeling of the piping?

Answer: Some utility companies have started to map the underground infrastructure. But in most cases we create it based on 2D maps which is only an approximation.

And that concludes the Industrial user studies session. See you next at the last session of the event: Rendering and Scene Acquisition, leading to the grand finale with the award ceremony for the winner of the Tracking Competition.


From ISMAR Program:

User studies in Industrial AR

  • Mobile Augmented Reality in Industrial Applications: Approaches for Solution of User-Related Issues
    Johannes Tuemler, Ruediger Mecke, Michael Schenk, Anke Huckauf, Fabian Doil, Georg Paul, Eberhard A. Pfister, Irina Boeckelmann, Anja Roggentin
  • Supporting Order Picking with AR
    Bjoern Schwerdtfeger, Gudrun Klinker
  • Virtual Redlining for Civil Engineering in Real Environments
    Gerhard Schall, Erick Mendez, Dieter Schmalstieg

Live From ISMAR ’08: Augmented Reality Sensors and Sensor Fusion

The last day of ISMAR ’08 is upon us, and the day opens by stimulating our senses with a session about sensors.

Gabriele Bleser starts this session with a talk about Using the marginalised particle filter for real-time visual-inertial sensor fusion

She starts by showing a short clip with an erratic camera motion that makes everyone dizzy…it actually demonstrates an important capability that she studied, one that creates less jitter and imposes fewer requirements on the camera.

She explains the basics of the particle filter and the use of inertial measurements. In the past researchers studied the standard particle filter. This is the first study using a marginalised particle filter.
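As a refresher on those basics, here is a minimal sketch of a standard (bootstrap) particle filter for a one-dimensional state. It is only the textbook baseline, not the marginalised variant presented in the talk, and it has nothing to do with camera pose. The random-walk motion model, the Gaussian noise levels, and the function name are assumptions chosen to keep the example short.

```python
import math
import random


def particle_filter(measurements, num_particles=100, process_std=0.5,
                    measurement_std=1.0, seed=0):
    """Bootstrap particle filter for a 1D random-walk state.

    Returns the weighted-mean state estimate after each measurement.
    """
    rng = random.Random(seed)
    # Start the particles spread around the first measurement.
    particles = [rng.gauss(measurements[0], measurement_std)
                 for _ in range(num_particles)]
    estimates = []
    for z in measurements:
        # Predict: diffuse each particle with the motion model.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Update: weight particles by the Gaussian measurement likelihood.
        weights = [math.exp(-0.5 * ((z - p) / measurement_std) ** 2)
                   for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed: fall back to uniform weights.
            weights = [1.0] * num_particles
            total = float(num_particles)
        # Estimate: weighted mean of the particles.
        estimates.append(
            sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return estimates


if __name__ == "__main__":
    print(particle_filter([5.0] * 30)[-1])
```

Roughly speaking, a marginalised filter handles the linear Gaussian part of the state analytically (for example with a Kalman filter per particle) and only samples the nonlinear part, which is how a small particle count can go a long way.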

Testing using the new technique (non linear state space model with linear Gaussian substructure for real time visual inertial pose estimation) with 100 particles resulted in increased robustness against rapid motions.

To prove it, Gabriele shows the rapid camera movements once again…

Well, we have to suffer now so that in the future users won’t have to. Kudos Gabriele.


Next is Daniel Pustka with Dynamic Gyroscope Fusion in Ubiquitous Tracking Environments. This is part of Gudrun Klinker’s journey towards Ubi-AR.

What you need for ubiquitous tracking is automatic discovery of tracking infrastructure and shielding of applications from tracking details.

Gyroscopes are very interesting to use (low latency, high update rate, always available), but they have drawbacks (drift, only for rotation) and are only usable when fused with other sensors.
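To make the drift problem concrete, here is a small sketch of one common way to fuse a gyroscope with a drift-free reference: a complementary filter on a single rotation angle. This is not the Ubitrack approach from the talk. The function names, the blend factor, and the idea of a second sensor that reports an absolute heading are assumptions for illustration, and angle wrap-around is ignored.

```python
def integrate_gyro(gyro_rates, start, dt):
    """Heading from the gyroscope alone: any bias accumulates as drift."""
    angle = start
    for rate in gyro_rates:
        angle += rate * dt
    return angle


def fuse_heading(gyro_rates, absolute_headings, dt, alpha=0.98):
    """Complementary filter for a single rotation angle, in degrees.

    gyro_rates: angular rate samples (deg/s), fast but biased.
    absolute_headings: drift-free headings (deg) from another sensor
        (hypothetical here, e.g. a compass or an optical tracker).
    alpha: how much to trust the integrated gyroscope at each step.
    """
    estimate = absolute_headings[0]
    history = []
    for rate, reference in zip(gyro_rates, absolute_headings):
        predicted = estimate + rate * dt  # integrate the gyroscope
        # Pull the prediction slightly toward the drift-free reference.
        estimate = alpha * predicted + (1.0 - alpha) * reference
        history.append(estimate)
    return history


if __name__ == "__main__":
    steps, dt = 5000, 0.01
    biased_rates = [1.0] * steps   # device is still; gyro bias is 1 deg/s
    references = [10.0] * steps    # true heading is 10 degrees
    print(integrate_gyro(biased_rates, 10.0, dt))        # drifts far away
    print(fuse_heading(biased_rates, references, dt)[-1])  # stays close
```

The gyroscope supplies the fast, low-latency changes, while the second sensor keeps pulling the long-term average back to the truth.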

Daniel and team have proved that the ubiquitous tracking tool set consisting of spatial relationship graphs and patterns is very useful to analyze tracking setups including gyroscopes. It allows a Ubitrack system to automatically infer occasions for gyroscope fusion in dynamically changing tracking situations.


Jeroen Hol presents Relative Pose Calibration of a Spherical Camera and an IMU

This study builds on the idea that by combining vision and inertial sensors you get accurate real-time position and orientation that stays robust under fast motion, which is very suitable for AR applications. However, calibration is essential for this to work.

An easy to use algorithm has been developed and yields results with real data.

Ron Azuma asks: When the image is captured in high motion does it create blur?

Jeroen answers that it can be addressed by changing some parameters.


Last for this session is Wee Teck Fong from NUS to discuss A Differential GPS Carrier Phase Technique for Precision Outdoor AR Tracking.

The solution that Fong presents provides good accuracy with low jitter, low drift, and low computational load – and no resolution ambiguities. It works well for outdoor AR apps. With just one GPS receiver you get an accuracy of about 10 meters, plus high jitter in the tracking. Differential GPS using 2 GPS receivers (low cost, 25mm in size) improves the accuracy of tracking. Fong and team have taken it a step further with an advanced computational model that delivers higher precision for outdoor AR tracking. Fong claims that with a more expensive receiver he can achieve better than 1mm accuracy, but you can’t use this technique just anywhere. An infrastructure of stationary GPS stations transmitting wirelessly could provide a wide constant coverage for this technique.
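The basic intuition behind using two receivers can be sketched in a few lines: both receivers see roughly the same error at the same moment, so the error measured at a station with a known position can be subtracted from the moving receiver's fix. This is a simplified position-domain correction, not the carrier phase technique from the talk. The function name, the flat 2D coordinates in meters, and the numbers are made up for illustration.

```python
def differential_correction(base_known, base_measured, rover_measured):
    """Remove the error shared by two nearby GPS receivers.

    base_known: surveyed (east, north) position of the fixed station.
    base_measured: what the fixed station's receiver currently reports.
    rover_measured: what the moving receiver currently reports.
    """
    # The fixed station knows where it really is, so it can measure
    # the current error directly.
    shared_error = tuple(m - k for m, k in zip(base_measured, base_known))
    # Assume the rover suffers the same error and subtract it.
    return tuple(r - e for r, e in zip(rover_measured, shared_error))


if __name__ == "__main__":
    base_known = (0.0, 0.0)
    base_measured = (6.0, -8.0)    # a 10 meter error right now
    rover_measured = (36.0, 32.0)  # true position is (30, 40)
    print(differential_correction(base_known, base_measured, rover_measured))
```

In practice the two receivers never share exactly the same error, which is one reason refinements like the one Fong describes are needed for AR-grade precision.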

Fong concludes on a positive note regarding the upcoming European counterpart to the GPS system, dubbed Galileo (in 5 years), when things will get significantly better.


From ISMAR ’08 Program

  • Using the marginalised particle filter for real-time visual-inertial sensor fusion
    Gabriele Bleser, Didier Stricker
  • Dynamic Gyroscope Fusion in Ubiquitous Tracking Environments
    Daniel Pustka, Gudrun Klinker
  • Relative Pose Calibration of a Spherical Camera and an IMU
    Jeroen Hol, Thomas Schoen, Fredrik Gustafsson
  • A Differential GPS Carrier Phase Technique for Precision Outdoor AR Tracking
    Wee Teck Fong, S. K. Ong, A. Y. C. Nee

Live from ISMAR ’08: The Gods of Augmented Reality About the Next 10 Years

Welcome to the climax of ISMAR ’08. On stage are the 9 “gods” of the augmented reality community, sitting in a panel to muse about the next 10 years of augmented reality.

Dieter Schmalstieg took on the unenviable job of moderating this crowd of big wigs. See if he can curb them down to 3 minutes each.

Here is a blow-by-blow coverage of their thoughts.

Ron Azuma (HRL)

The only way for AR to succeed is when we insert AR into our daily lives – it has to be available all the time (like Thad Starner from GA Tech, who always wears his computer)
Ron asks – What if we succeed? What are the social ramifications? Those who have thought about it are science fiction writers…such as Vernor Vinge (have you read Rainbows End and Synthetic Serendipity?)

Reinhold Behringer (Leeds)

AR is at the threshold of broad applications.
Cameras, GPS, and bandwidth have improved immensely – the field is splitting into lo-fi AR (approximate registration, low-end hardware) and hi-end AR (live see-through displays, etc.)
What’s missing is APIs, common frameworks, ARML descriptor (standardization)

Mark Billinghurst (HitLab NZ)

Mobility (now) – It took 10 years to go from backpack to palm
Ubiquity (5+ years) – how will AR devices work with other devices (TV, home theater, …),
Sociability – it took us 10 years to go from 2 to 4 to 8 users . When will we have massive scale?
Next is AR 2.0 with massive user generated content and a major shift from technology to user interaction

Steve Feiner – Columbia

AR means “The world = your user interface”
What will it take to make this possible?
Backpacks are ridiculous; handheld devices will look ridiculous 5 years from now – so don’t write off eyewear.
A big one is dynamic global databases for identification/tracking of real world objects. Tracking could be viewed as “just” search (granted a new kind of search.)
There is more to AR than registration; AR presentations need to be designed (AR layouts).

Gudrun Klinker – TU Munchen

Integrating AR with ubiquitous computing. We are interfacing with reality, with our senses and others are mental. We need those lenses to connect to our “senses” (not just visually – it could also be sound, etc). Combining the virtual with the real – where is the information? And can we see it? How do we communicate with the stationary world? We need to connect with the room we are in and hear the “story”. The devices at least need to talk to each other.
We also need to think about “augmented” buildings; they do not evolve as fast as cell phones. Another aspect is how we are going to survive “this thing”. We need many more usability studies, and we need to connect them with real world applications. The ultimate test (I challenge you to show it in next year’s competition) is a navigation system for runners. It’s easy to do it for cars – but may be harder for people.

Nassir Navab –  TU Munchen

Medical augmented reality – showing fascinating videos of medical overlays

The simplest idea is getting into the operation room – combining X Ray and optics as part of the common operating workflow.

Next is fusion of pre/intra operative functional and anatomical imaging; patient motion tracking and deformable registration; adaptive, intuitive and interactive visualization; Integration into surgical workflow
Finally we need to focus on changing the culture of surgeons (e.g. training with AR simulation).

Haruo Takemura – Osaka University

Showing a table comparing the pros and cons of hardware platforms: e.g. mobile have potential benefits vs HMD (but also drawbacks – such as processing power); desktop is cheap and powerful but not mobile (tethered).
Cell phones have another issue – they are tied to the carriers which is problematic for developers.

Bruce Thomas – UniSA

We are extremely interdisciplinary – and should keep it up.
However with so many of these it’s hard to develop and evaluate. And by the way innovation is difficult to articulate.
We are in a “Neat vs. Scruffy” situation – the bottom line is that smaller self-contained pieces of research are easier to get in front of the community – and get results.

Questions floating:
Is high end or low end AR the goal?
Is ubiquity in AR realistic or wishful thinking?
Are we innovative?
Does augmented reality need to make more money to survive?
Platforms: Don’t write off eyewear?
Social: what if we succeed with AR?
What is the position of ISMAR in the scientific community?

A controversial question from the audience to the panel: How many of you have subject matter experts working in your office on a daily basis? (few hands) How many of you have artists working with you on a daily basis? (even fewer hands) How much of your research has reached the real world? (once again – few hands)

A question from the audience about the future of HMD. Mark takes the mic and asks the audience:

How many of you would wear a head mounted display? (5 hands)

How many of you would wear a head mounted display that looks like normal glasses? (75% of the audience raise hands)

Dieter asks the panel members to conclude with one sentence each (no semicolons…)

Ron: I want to refer to the comment that the cell phone is too seductive. We should make it indispensable so users won’t want to give it up – just like a cell phone.

Mark: We need to make sure that children, grandparents, in Africa and everywhere – could use AR

Steve: You ain’t seen nothing yet; look at the progress we have made in the last 10 years! No one can predict what will happen.

Gudrun: We have to be visionary and, on the other hand, we need to be realistic and make sure AR doesn’t end up like AI…don’t build hopes in areas where people shouldn’t have them…don’t let AR get burned…

Nassir: Next event we should include designers and experts from other disciplines; and create solutions that go beyond the fashion

Haruo: Maybe combining information like Google’s with devices

Bruce: I want you to have fun and be passionate about what you do! We can change the world!

Applause, and that’s a wrap.

Live from ISMAR ’08: Tracking – Latest and Greatest in Augmented Reality

After a quick liquid adjustment, and a coffee fix – we are back with the next session of ISMAR ’08, tackling a major topic in augmented reality: Tracking.

Youngmin Park is first on stage with Multiple 3D Object Tracking. His first demonstration is mind blowing. He shows an application that tracks multiple 3D objects, which has never been done before – and is quite essential for AR applications.

The approach combines the benefits of multiple approaches while avoiding their drawbacks:

  • Match input image against only a subset of keyframes
  • Track features lying on the visible objects over consecutive frames
  • Two sets of matches are combined to estimate the object 3d poses by propagating errors

Conclusion: Multiple objects are tracked at interactive frame rates, and the tracking speed is not affected by the number of objects.

Don’t miss the demo.


Next are two talks by Daniel Wagner from Graz University; the first is about his favorite topic: Robust and Unobtrusive Marker Tracking on Mobile Phones.

Why AR on cell phones? There are more than a billion phones out there and everyone knows how to use them (which is unusual for new hardware).

A key argument Daniel is making: Marker tracking and natural feature tracking are complementary. But we need more robust tracking for phones, and less obtrusive markers.

The goal: Less obtrusive markers. Here are 3 new marker designs:

The frame marker (the frame provides the marker while the inner area is used to present human readable information).

The split marker (somewhat inspired by Sony’s The Eye of Judgment): we use a split barcode, with similar thinking to the frame marker.

A third marker is a Dot marker. It covers only 1% of the overall area (assuming it’s uniquely textured – such as a map).

Incremental tracking using optical flow:

These requirements are driven by industrial needs: “more beautiful markers” and of course making them more robust.


Daniel continues with the next discussion about Natural feature tracking on mobile phones.

Compared with marker tracking, natural feature tracking is less robust and requires more knowledge about the scene, more memory, better cameras, more computational load…

To make things worse, mobile phones have less memory, with less processing power (and no floating point computation), and a low camera resolution…

The result is that a high end cell phone runs 10x slower than a PC, and it’s not going to improve soon, because battery power is limiting the advancement of these capabilities.

So what to do?

We looked at two approaches:

  • SIFT (one of the best object recognition engines – though slow)
  • Ferns (state of the art for fast pose tracking – but very memory intensive)

So neither approach will work on cell phones as is…

The solution: combine the best of both worlds into what they call PhonySift (modified SIFT for phones), and then complement it with PhonyFern – detecting dominant orientation and predicting where the feature will be in the next frame.
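
To give a feel for what “detecting dominant orientation” can involve, here is a minimal sketch in the spirit of SIFT-style orientation histograms: gradient directions inside a patch vote into angle bins, weighted by gradient magnitude. The function name, the 36 bin default and the list-of-lists patch are assumptions for illustration; this is not the code presented in the talk.

```python
import math


def dominant_orientation(patch, bins=36):
    """Return the center angle (degrees) of the strongest gradient direction.

    patch is a 2D list of grayscale values. Angles follow image coordinates
    (x to the right, y downward). Illustrative sketch only.
    """
    hist = [0.0] * bins
    bin_width = 360.0 / bins
    height, width = len(patch), len(patch[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[int(angle // bin_width) % bins] += magnitude
    best = max(range(bins), key=hist.__getitem__)
    return (best + 0.5) * bin_width


if __name__ == "__main__":
    # Brightness grows left to right, so the gradient points along +x.
    ramp = [[10 * x for x in range(5)] for _ in range(5)]
    print(dominant_orientation(ramp))
```

Once a feature has a dominant orientation, its descriptor can be computed relative to that angle, which is what makes matching tolerant to in-plane rotation.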

Conclusion: both approaches did eventually work on mobile phones in an acceptable fashion. The combined strength made it work, and now both Ferns and SIFT run at similar speeds with similar memory usage.


From ISMAR ’08 Program:

  • Multiple 3D Object Tracking for Augmented Reality
    Youngmin Park, Vincent Lepetit, Woontack Woo
  • Robust and Unobtrusive Marker Tracking on Mobile Phones
    Daniel Wagner, Tobias Langlotz, Dieter Schmalstieg
  • Pose Tracking from Natural Features on Mobile Phones
    Daniel Wagner, Gerhard Reitmayr, Alessandro Mulloni, Tom Drummond, Dieter Schmalstieg

Live from ISMAR ’08: Near-Eye Displays – a Look into the Christmas Ball

The third day of ISMAR ’08, the world’s best augmented reality event, is unfolding with what we expect to be an eye popping keynote (pun intended) by Rolf R. Hainich, author of The End of Hardware.

He is introduced as an independent researcher who started to work on AR in the early ’90s – so he could be considered a pioneer…

A question on everyone’s mind is: Why a Christmas ball and not a crystal ball?

Rolf jumps on stage and starts with a quick answer: Christmas balls can help produce concave mirrors – useful for near eye displays.

The first near eye display was created in 1968 by Ivan Sutherland; in 1993 an HMD for out of cockpit view was built in a Tornado simulator. In 2008, we see multiple products such as NVIS, Zeiss HOE glasses, Lumus, and Microvision, but Rolf doesn’t consider them true products for consumers.

Rolf defined the requirements for a near eye display back in 1994. They included: eye tracker, camera based position sensing, dynamic image generator, registration, mask display, holographic optics. And don’t forget no screws, handles, straps, etc…

He then presents several visions of the future of human machine interaction, which he dubs the 3D operating system. Then he briefly touches on the importance of sound, economy and ecology – and how near eye displays could save so much hardware and power, and help protect the environment.

But it requires significant investment. This investment will come from home and office applications because of economies of scale; other markets such as military, medical, etc. will remain niche markets.

The next argument relates to the technology: Rolf gives examples of products such as memory, displays, cell phones, and cameras, which experienced dramatic improvements and miniaturization over the last years. And here is the plug for his famous joke: Today, I could tape cell phones on my eyes and they would be lighter than the glasses I used to wear 10 years ago…

Now he skims through different optical design options with mirrors, deflectors, scanners, eye tracker chips, etc. (which you can review in his book The End of Hardware). These designs could support a potential killer app: an eye operated cell phone…

Microvision’s website is promoting such a concept (not a product), mostly to get the attention of phone manufacturers, according to Rolf.

Rolf then tackles mask displays, a thorny issue for AR engineers, and suggests they can achieve greater results than you would expect.

Eye tracking is necessary to adjust the display based on where the eye is pointing. It’s one thing that AR didn’t inherit from VR. But help could come from a different discipline: computer mice, which have become pretty good at tracking motion.

Other considerations, such as aperture, focus adjustment (should be mechanical), and the eye controller, are all solvable in Rolf’s book.

Squint and Touch – we usually look where we want to touch, so by following the eye we could simplify the user interface significantly.

Confused? Rolf is just getting started and dives effortlessly into lasers, describing what exists and what needs to be done. It should be pretty simple to use. And if it’s not enough, holographic displays could do the job. Rolf has the formulas. It’s just a matter of building it.

He now takes a step back and looks at the social impact of this new technology: when everybody “wears”, anybody can be observed. Big Brother rears its ugly head. Privacy is undermined, copyright issues get out of control. But…resistance is futile.

Rolf wraps up with a quick rewind and fast forward describing the technology ages: the PC emerged in the ’80s, AR will emerge in the 2020s, and chip implants (Matrix style) will rule in the 2050s.

Question: It didn’t look like the end of hardware…

Rolf: It’s the end of conventional hardware – we will still have hardware, but it could be 1000 times lighter.

Tom Drummond (from the audience): There is still quite a lot of work to get these displays done, and there is still some consumer resistance to putting on these head up displays…

Rolf: People wear glasses even for the disco – it’s a matter of fashion and of making it light – with the right functionality.


From the ISMAR ’08 Program:

Speaker: Rolf R. Hainich, Hainich&Partner, Berlin

We first have a look at the development of AR in the recent 15 years and its current state. Given recent advances in computing and micro system technologies, it is hardly conceivable why AR technology should not finally be entering into mass market applications, the only way to amortize the development of such a complex technology. Nevertheless, achieving a ‘critical mass’ of working detail solutions for a complete product will still be a paramount effort, especially concerning hardware. Addressing this central issue, the current status of hardware technologies is reviewed, including micro systems, micro mechanics and special optics, the requirements and components needed for a complete system, and possible solutions providing successful applications that could catalyze the evolution towards full fledged, imperceptible, private near eye display and sensorial interface systems, allowing for the everyday use of virtual objects and devices greatly exceeding the capabilities of any physical archetypes.

Live from ISMAR ’08: Latest and Greatest in Augmented Reality Applications

It’s getting late in the second day of ISMAR ’08 and things are heating up…the current session is about my favorite topic: Augmented Reality applications.

Unfortunately, I missed the first talk, by Raphael Grasset, about the Design of a Mixed-Reality Book: Is It Still a Real Book? (I had a brilliant interview with Mark Billinghurst instead.)

I will do my best to catch up.

Next, Tsutomu Miyashita and Peter Meier (Metaio) are on stage to present an exciting project that Games Alfresco covered in our museum roundup: An Augmented Reality Museum Guide, the result of a partnership between the Louvre-DNP Museum Lab and Metaio.

Miyashita introduces the project and describes the two main principles of this application: work appreciation and guidance.

Peter describes the technology requirements:

  • guide the user through the exhibition and provide added value to the exhibitions
  • integrate with an audio guide service
  • no markers or large area tracking – only optical and mobile trackers

The technology used was Metaio’s Unifeye SDK, with a special program developed for the museum guide. Additional standard tools (such as Maya) were used for the modeling. All the 3D models were loaded on the mobile device. The location recognition was based on the approach introduced by Reitmayr and Drummond: Robust model based outdoor augmented reality (ISMAR 2006).

600 people experienced the “work appreciation” and 300 people the guidance application.

The visitors’ responses ranged from “what’s going on?” to “this is amazing!”.

In web terms, the AR application created a higher level of “stickiness”. Users came back to see the art work and many took pictures of the exhibits. The computer graphics definitely captured the attention of users. It especially appealed to young visitors.

The guidance application got high marks: “I knew where I had to go”, but on the flip side, the device was too heavy…

In conclusion, in this broad exposure of augmented reality to a wide audience, the reaction was mostly positive. The new experience was a “good” surprise. Because this technology is so new to visitors, there is a need to keep making it more and more intuitive.


Third and last for this session is John Quarles discussing A Mixed Reality System for Enabling Collocated After Action Review (AAMVID)

Augmented reality is a great tool for training.

Case in point: Anesthesia education – keeping the patient asleep through anesthetic substance.

How could we use AR to help educate the students on this task?

After action review has been used in the military for ages: after performing a task, you discuss what happened. How did I do? What can I do better?

AR can provide two functions: review a fault test + provide directed instruction repetition.

With playback controls on a magic lens, the student can review her own actions, see the expert actions in the same situation, while viewing extra information about how the machine works (e.g. flow of liquids in tubes) – which is essentially real time abstract simulation of the machine.

The result of a study with testers showed that users prefer Expert Tutorial Mode which collocates expert log with realtime interaction.

Educators, on the other hand, can identify trends in the class and modify the course accordingly.
Using “Gaze mapping” the educator can see where many students are pointing their magic lens and unearth an issue that requires a different teaching method. In addition, educators can see statistics of student interactions.
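
One plausible way to build such a gaze map is to bin the points where each magic lens is aimed into a coarse grid and count the hits per cell. This is only a sketch under assumptions: the talk did not describe the implementation, and gaze_heatmap, hottest_cell and the 50 pixel cell size are hypothetical.

```python
from collections import Counter


def gaze_heatmap(samples, cell_size=50):
    """Count how many lens aim points fall into each grid cell.

    samples is an iterable of (x, y) positions, one per logged aim point.
    Returns a Counter keyed by (column, row) grid cell.
    """
    heat = Counter()
    for x, y in samples:
        heat[(int(x // cell_size), int(y // cell_size))] += 1
    return heat


def hottest_cell(heat):
    """Return ((column, row), count) for the most viewed cell, or None."""
    return heat.most_common(1)[0] if heat else None


if __name__ == "__main__":
    # Made-up aim points, purely to show the call pattern.
    demo = [(10, 10), (20, 30), (120, 40)]
    print(hottest_cell(gaze_heatmap(demo)))
```

A cell that collects far more hits than its neighbors would be a candidate for the kind of trouble spot the educator wants to notice.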

Did students prefer the “magic lens” or a desktop?

Desktop was good for personal review (afterward), while the magic lens was better for external review.

The conclusion is that an after action review using AR works. Plus it’s a novel assessment tool for educators.

And the punch line: John Quarles would have killed to have such an After action review to help him practice for this talk…:-)


From ISMAR ’08 Program:


  • Design of a Mixed-Reality Book: Is It Still a Real Book?
    Raphael Grasset, Andreas Duenser, Mark Billinghurst
  • An Augmented Reality Museum Guide
    Tsutomu Miyashita, Peter Georg Meier, Tomoya Tachikawa, Stephanie Orlic, Tobias Eble, Volker Scholz, Andreas Gapel, Oliver Gerl, Stanimir Arnaudov, Sebastian Lieberknecht
  • A Mixed Reality System for Enabling Collocated After Action Review
    John Quarles, Samsun Lampotang, Ira Fischler, Paul Fishwick, Benjamin Lok

Live from ISMAR ’08: The dARk side of Physical Gaming

Welcome to the late evening keynote of the second day of ISMAR ’08 in Cambridge.

The keynote speaker is Diarmid Campbell from Sony Computer Entertainment Europe (London), who heads its research on camera gaming. And we are covering it in real time.

Diarmid comes on stage. The crowd is going crazy…

The talk: Out of the lab and into the living room

What is a camera game? Simply put, you see yourself in the camera and add graphics on top.

The trouble with the brain: it fixes things you see (for example, on a checkerboard, a black square in the light has the same color as a white square in the dark).

Background subtraction is the first thing you try to do. Using this technique, Diarmid superimposes himself in real time on top of…the ’70s super band ABBA…
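
For the curious, background subtraction in its simplest form can be sketched as below: compare each pixel with a stored image of the empty scene, keep only the pixels that changed, and paste them over a new backdrop. The grayscale list-of-lists frames, the threshold of 30 and the function names are assumptions for illustration, not Sony’s implementation.

```python
def foreground_mask(frame, background, threshold=30):
    """Mark pixels that differ from the stored background by more than threshold."""
    return [
        [abs(pixel - reference) > threshold for pixel, reference in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def composite(frame, mask, backdrop):
    """Keep the camera pixel where the mask is set, else show the backdrop."""
    return [
        [pixel if is_player else scenery
         for pixel, is_player, scenery in zip(frame_row, mask_row, backdrop_row)]
        for frame_row, mask_row, backdrop_row in zip(frame, mask, backdrop)
    ]


if __name__ == "__main__":
    empty_room = [[10, 10, 10] for _ in range(3)]
    with_player = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
    stage = [[99, 99, 99] for _ in range(3)]
    mask = foreground_mask(with_player, empty_room)
    print(composite(with_player, mask, stage))
```

In a real living room the lighting changes and the camera is noisy, which is part of why a fixed threshold like this one is fragile.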

User interface motion buttons – use virtual buttons that the user activates. The response is not as robust, but it’s more responsive.
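
A motion button of this kind can be approximated with frame differencing inside the button’s rectangle: if enough pixels changed between two consecutive frames, the button fires. A minimal sketch, assuming grayscale frames as nested lists; motion_button_pressed and its thresholds are hypothetical and not taken from any EyeToy code.

```python
def motion_button_pressed(prev_frame, curr_frame, region,
                          pixel_threshold=25, min_fraction=0.2):
    """Return True if enough pixels inside region changed between two frames.

    region is (top, left, bottom, right) with bottom and right exclusive.
    """
    top, left, bottom, right = region
    total = (bottom - top) * (right - left)
    if total <= 0:
        return False
    changed = 0
    for y in range(top, bottom):
        for x in range(left, right):
            if abs(curr_frame[y][x] - prev_frame[y][x]) > pixel_threshold:
                changed += 1
    return changed / total >= min_fraction


if __name__ == "__main__":
    before = [[0] * 4 for _ in range(4)]
    after = [row[:] for row in before]
    after[0][0] = 100  # a hand waving over the top left corner
    after[1][1] = 100
    print(motion_button_pressed(before, after, (0, 0, 2, 2)))
    print(motion_button_pressed(before, after, (2, 2, 4, 4)))
```

Lowering min_fraction makes the button more responsive but also easier to trigger by accident, which mirrors the robustness versus responsiveness trade-off mentioned in the talk.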

Example of EyeToy Kinetic

Next is a demonstration of vector buttons and optical flow.

You have to keep the control on the side – otherwise the player’s body will activate it unintentionally.

It turns out Sony decided not to use this control…not just yet.

A similar control was actually published in Creature Adventures, available online. Diarmid struggles with it. The crowd goes wild. Diarmid: “You get the idea…”

Good input device characteristics: Many degrees of freedom, non-abstract (player action=game action), robust and responsive.

Camera games have been accused in the past of not having depth (too repetitive). There are 2 game mechanics: skill based (shoot the bad guy) and puzzle based. This could become shallow – unless you deliver on the responsiveness and robustness.

To demonstrate color tracking, Diarmid dives into the next demo (to the pleasure of the audience…). For this demo he holds 2 cheerleader pompoms…
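
Color tracking of brightly colored props like these can be sketched as: pick the pixels close to a target color and take their centroid. The function name track_color, the RGB tuples and the tolerance value are assumptions for illustration; the talk did not show how Sony’s tracker works.

```python
def track_color(frame, target, tolerance=40):
    """Return the (x, y) centroid of pixels close to target, or None.

    frame is a 2D list of (r, g, b) tuples; target is an (r, g, b) tuple.
    A pixel matches when every channel is within tolerance of the target.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if all(abs(channel - wanted) <= tolerance
                   for channel, wanted in zip(pixel, target)):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


if __name__ == "__main__":
    black, pink = (0, 0, 0), (255, 0, 200)
    frame = [
        [black, black, pink],
        [black, black, black],
        [black, black, pink],
    ]
    print(track_color(frame, pink))
```

Running one tracker per prop color would give two positions per frame, enough to drive a simple pose matching game.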

“It’s like a dance dance revolution game, so I also have to sing and occasionally shout out party…”

The crowd is on the floor.

We are on to drawing games, Sketch Tech. He draws a cow that is supposed to land on a banana shaped moon. He succeeds!

Using a face detector from Japan, here is a Head Tracking game: a green ball hangs from his mouth (a pendulum) and with circular moves of his head he rotates it, while trying to balance it…

The Eye of Judgment, a game that came out last year (brought out by Sony), relies on marker based augmented reality technology. It is similar to a memory game, with a camera and a computer, and cards.

We are starting to wrap up and Diarmid summarizes, credits Pierre for setting up all the hardware, and opens the floor for questions.

Question: How do you make the game interesting when you’re doing similar gestures over and over again…

Diarmid: When the game is robust and responsive – you’ll be surprised how long you can play the game and try to be better.

Blair MacIntyre (from the audience): Robust and learn-able is what makes the game fun over time.

Question: Is there anything more you can tell us about the depth camera? Will it be available soon to consumers?

Diarmid: No.

The crowd bursts into laughter.

Blair (jumps in from the audience): There is a company called 3DV in Israel which offers such a camera. It’s not cheap, or as good as discussed before, but you can get it.

Q: What’s special about camera games beyond novelty?

Diarmid: The 2 novel aspects of camera games are that they allow you to see yourself, and you can avoid the controller. Camera games are also great for multiplayer.

Q: Is there a dream game you’d like to see?

Diarmid: Wow, that’s hard…I worked on a game before Sony called The Thing based on Carpenter’s movie. It was all about trust. The camera suddenly opens up the ability to play with that. When people see each other, the person to person interaction is very interesting and hasn’t been explored in games.

Q: Will we see camera games on PSP?

Diarmid: There is a game in development, and I don’t know if I can talk about it…

Q: When I look in the mirror I am not so comfortable with what I see…how do you handle that?

Diarmid: We flip the image. It’s hard to handle a ball, when just looking at the mirror.

And that’s a wrap! Standing ovation.


After party shots…