Guest Post: Presence – The Powerful Sensation Making Virtual Reality Ready for Primetime

Evolving the Human Machine Interface Part III

The concept of Presence in Virtual Reality (VR) has been gaining popularity over the past year, particularly within the gaming community. With consumer VR devices in development from Oculus, Sony, and more than likely Microsoft, Presence has become the metric by which we evaluate all VR experiences. But Presence is difficult to describe to someone who has never tried VR. “It’s like I was actually there. It made me feel like what I was seeing was actually happening to me, as though I was experiencing it for real,” is how one colleague described the experience.

Presence in VR triggers the same physical and emotional responses one would normally associate with a real-world situation; it is the wonderfully magical experience of VR. But how is Presence achieved? While many research studies have offered a variety of subjective descriptions of Presence, three common variables seem to affect tele- or virtual Presence most:

1.  Sensory input: at minimum the ability to display spatial awareness

2.  Control: the ability to modify one’s view and interact with the environment

3.  Cognition: our individual ability to process and respond to sensory input

Because VR isolates the user from real-world visual input, if the device's sensory input and control are inadequate or missing, the effect of Presence fails, and the result is often an unpleasant side effect: you feel sick to your stomach!

Sensory Input

For those who have tried VR, at some point or another you've felt queasy. That point at which the experience turns from wonder to whoa has been an unfortunate side effect throughout the development of VR. As Michael Abrash presented at Valve's Steam Dev Days, the hurdles to overcoming VR sickness and achieving Presence are within reach. In the video below, Michael expertly summarizes the technical hurdles to achieving a believable sense of Presence in VR.

“What VR Could, Should, and Almost Certainly Will Be within Two Years” Steam Dev Days 2014

Michael Abrash, Valve Software

Control

To achieve a minimum level of Presence, head tracking is used to match the displayed VR image to the user's own head position and orientation. While this helps to create a sense of spatial awareness within VR, it still doesn't make someone's experience truly "Present." To do that, we need to add a representation of ourselves, either through an avatar or our own body image. Viewing our physical presence in VR, known as body awareness, creates an instant sense of scale and helps to ground the user within the experience. In the video sample below, Untold Games creates body awareness through avatar control.

“Loading Human, true body awareness” Untold Games 2014

Without body awareness, VR can feel more like an out-of-body experience. Everything "looks" real and the user has spatial awareness, but the user's body and movements are not reflected, so the user never feels truly present. Combining body awareness with VR's spatial awareness creates a strong bond between the user and the experience.
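For readers who like to see the mechanics, here is a minimal sketch, in Python with NumPy, of the head-tracking step described above: turning a tracked head position and orientation into the view matrix used to render each frame. The quaternion convention, coordinate system, and variable names are assumptions for illustration, not any particular headset's API.

```python
import numpy as np

def quat_to_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def view_matrix(head_position, head_orientation):
    """Build a 4x4 world-to-eye (view) matrix from a tracked head pose.

    The view matrix is the inverse of the head's rigid transform:
    rotate by R^T, then translate by -R^T @ p.
    """
    R = quat_to_matrix(head_orientation)
    view = np.eye(4)
    view[:3, :3] = R.T
    view[:3, 3] = -R.T @ np.asarray(head_position)
    return view

# Hypothetical tracker sample: head 1.7 m up, turned 30 degrees about the vertical axis.
angle = np.radians(30)
pose_q = np.array([np.cos(angle / 2), 0.0, np.sin(angle / 2), 0.0])  # yaw about +Y
pose_p = np.array([0.0, 1.7, 0.0])
print(view_matrix(pose_p, pose_q))
```

Rendering every frame from a matrix like this, with low enough latency, is what gives the "look anywhere and the world stays put" part of spatial awareness; body awareness then layers our own representation on top of it.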

Cognition

The third parameter of Presence is us. The feeling of Presence in VR is directly influenced by our personal ability to process and react to environmental changes in the real world. It's likely that many of us will not have the same reactions to experiences within VR. If you easily get sick riding in cars, then VR motion will give you the same sensation. If you're afraid of heights, fire, spiders, etc., you're going to have the same strong reactions and feelings in VR. Our individual real-life experience influences our perception of and reactions to VR. This can lead to some interesting situations, particularly in gaming: one player may be relatively unaffected by a situation or challenge, while another may be strongly affected.

Obviously the conditions of Presence are perceptual only. In most cases we’re not at the same physical risk in virtual environments as we would be in real life. But our own cognition coupled with VR’s ability to create Presence is why VR is such a popular field for everything from gaming and entertainment to therapy and rehabilitation.

Once we start to overcome these technical hurdles and provide a basic level of Presence, we next need to understand what it will ultimately enable. What does Presence provide for us in an experience other than merely perceiving the experience as real-like? We’ll explore that idea in the next segment, and try to understand where Presence will have the most impact.

Guest Post: Harnessing the Power of Human Vision

By Mike Nichols, VP Content and Applications at SoftKinetic

For some time now, we've been in the midst of a transition away from computing on a single screen. Advances in technology, combined with the accessibility of the touch-based Human Machine Interface (HMI), have enabled mobile computing to explode. This trend will undoubtedly continue to evolve as we segue into more wearable Augmented Reality (AR) and Virtual Reality (VR) technologies. While AR and VR may provide substantially different experiences from their flat-screen contemporaries, both face similar issues of usability. Specifically, what are the most accessible ways to interact with these devices?

The history of HMI development for both AR and VR has iterated along similar paths of using physical controllers for user navigation. Although physical controls have been a necessity in the past, if they remain the primary input, these tethered devices will only serve as shackles that prevent AR and VR from reaching their full potential as wearable devices. While physical control devices can and do add a feeling of immersion to an experience, particularly in gaming, you would no more want a smartphone that was only controllable via a special glove than you would want to control your smart glasses through a tethered controller. As the technology for AR and VR continues to evolve, it will eventually need embedded visual and audio sensors to support the HMI: in particular, visual sensors that support a full suite of human interactions and integrate with our daily activities in a more natural and seamless way than our mobile devices do today.

In Depth

A depth sensor is the single most transformative technology for AR and VR displays because it sees the environment as 3-dimensional data, much like you or I do with our own eyes. It's the key piece of technology that provides the building blocks needed to interact with our environments, virtual or otherwise. By tracking our hands and fingers, the depth sensor allows us to reach out and manipulate virtual objects and UI.
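To make "seeing the environment as 3-dimensional data" a little more concrete, here is a small sketch of how a depth image is commonly back-projected into a cloud of 3D points using a pinhole camera model. The intrinsics and the synthetic depth frame below are placeholder values, not those of any specific sensor.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into an (N, 3) array of 3D points.

    Uses the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with zero depth (no reading) are dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Placeholder intrinsics and a synthetic 240x320 depth frame at 1.5 m.
depth = np.full((240, 320), 1.5)
cloud = depth_to_points(depth, fx=280.0, fy=280.0, cx=160.0, cy=120.0)
print(cloud.shape)  # (76800, 3)
```

Everything else described in this post, from hand tracking to surface detection, works on top of point data like this.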

A scene's depth information can be used for surface and object detection, then overlaid with graphics displayed relative to any surface at the correct perspective for our head's position and angle. Depth recognition combined with AR and VR represents a profound change from the way we receive and interact with our 2D digital sources today. To simulate this effect, the video below shows how a process known as projection mapping can transform even simple white cards in astonishing ways.

“Box” Bot & Dolly
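The surface detection mentioned above can be approximated, under simplifying assumptions (a single dominant, roughly planar surface and reasonably clean depth data), by fitting a plane to those back-projected points. The least-squares sketch below is purely illustrative; production systems are considerably more robust.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane fit: returns a unit normal and a point on the plane.

    The normal is the singular vector of the centered points with the
    smallest singular value (the direction of least variance).
    """
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, centroid

# Synthetic example: noisy points on the floor plane y = 0.
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(500, 3))
pts[:, 1] = rng.normal(0.0, 0.01, size=500)  # floor with about 1 cm of noise
normal, point = fit_plane(pts)
print(normal)  # close to (0, +/-1, 0)
```

Once a surface is known, graphics can be anchored to it and rendered at the correct perspective as the head moves.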

It's not hard to imagine how AR combined with depth could transform our view of the world around us: not only augmenting our world view with information, but even transforming live entertainment such as theater, concerts, sporting events, even photography and more.

Take a more common example like navigation. Today, when we use our smartphones or GPS devices to navigate, our brain has to translate the 2D information on the screen into the real world. Transferring information from one context to another is a learned activity and often confusing for many people; we've all missed a turn from time to time and blamed the GPS for confusing directions. In contrast, when navigating with depth-enabled AR glasses, the path will be displayed as if projected into the environment, not overlaid on a flat simulated screen. Displaying graphics mapped to our environment creates more context-aware interactions and makes it easier to parse relevant information based on distance and viewing angle.
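As a rough sketch of that idea: given the head pose and display intrinsics (placeholder values again), a waypoint known in world coordinates can be projected to the pixel where it actually lies in the wearer's view, rather than drawn on a detached 2D map.

```python
import numpy as np

def project_waypoint(p_world, view, fx, fy, cx, cy):
    """Project a 3D world-space waypoint to pixel coordinates.

    `view` is a 4x4 world-to-eye matrix (e.g. built from head tracking).
    Returns None if the point is behind the viewer.
    """
    p_eye = view @ np.append(p_world, 1.0)
    if p_eye[2] <= 0:          # behind the camera in this +Z-forward convention
        return None
    u = fx * p_eye[0] / p_eye[2] + cx
    v = fy * p_eye[1] / p_eye[2] + cy
    return np.array([u, v])

# Identity head pose and a waypoint 4 m ahead, 1 m to the right (placeholder values).
view = np.eye(4)
print(project_waypoint(np.array([1.0, 0.0, 4.0]), view, 500.0, 500.0, 320.0, 240.0))
```

Because the route markers move with the environment rather than with the screen, there is no mental translation step: the turn is simply where it appears to be.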

Bridge the gap

As we look to the future of AR and VR, both will certainly require new approaches to enable an accessible HMI. But that won't happen overnight. With commercial VR products from the likes of Oculus, Sony, and more coming soon, we'll have to support an interactive bridge to a new HMI through existing controllers. Both Sony and Microsoft already offer depth cameras for their systems that support depth recognition and human tracking. The new Oculus development kit includes a camera for tracking head position.

We're going to learn a lot over the next few years about which interactions work well and which do not. With the technology advances needed to make commercial AR glasses feasible as a mass-market option still a ways off, it's even more important to learn from VR. Everything done to make VR more accessible will make AR better.

Stay tuned for our next guest post, where we’ll take a closer look at how depth will provide a deeper and more connected experience.

Guest Post: Evolving the Human Machine Interface

How the World Is Finally Ready For Virtual and Augmented Reality

By Mike Nichols, VP, Content and Applications at SoftKinetic

The year is 1979, and Richard Bolt, a researcher at MIT, demonstrates a program that enables the control of a graphical interface by combining speech and gesture recognition. As the video of his demonstration below shows, Richard points at a projected screen image and issues verbal commands like "put that there" to control the placement of images within a graphical interface, in what he calls a "natural user modality".

“Put-That-There”: Voice and Gesture at the Graphics Interface
Richard A. Bolt, Architecture Machine Group
Massachusetts Institute of Technology – under contract with the Cybernetics Technology Division of the Defense Advanced Research Projects Agency, 1979.

What Bolt demonstrated in 1979 was the first natural user interface. A simple pointing gesture combined with a verbal command, while an innate task in human communication, was and still is difficult for machines to understand correctly. It would take another 30 years for a consumer product to appear that might just fulfill that vision.
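As a rough, modern-day sketch of the mechanism Bolt demonstrated, the snippet below fuses a pointing ray with a spoken command to resolve "that" (the object being pointed at) and "there" (the location pointed at when the word is spoken). Every name and value in it is hypothetical; the point is simply that neither channel alone carries enough information, while the combination does.

```python
import numpy as np

def ray_hits(origin, direction, objects, radius=0.3):
    """Return the object whose center lies within `radius` of the pointing
    ray, or the 3D point where the ray meets the horizontal plane y = 0."""
    direction = direction / np.linalg.norm(direction)
    for name, center in objects.items():
        to_obj = center - origin
        dist = np.linalg.norm(to_obj - np.dot(to_obj, direction) * direction)
        if dist < radius:
            return ("object", name)
    t = -origin[1] / direction[1]          # intersect the plane y = 0 ("there")
    return ("location", origin + t * direction)

def put_that_there(ray_at_that, ray_at_there, objects):
    """Resolve 'put that there': what was pointed at, and where to move it."""
    _, target = ray_hits(*ray_at_that, objects)
    _, destination = ray_hits(*ray_at_there, {})
    return target, destination

# Hypothetical scene: one image ("sun") on the display surface.
objects = {"sun": np.array([1.0, 0.0, 2.0])}
origin = np.array([0.0, 1.5, 0.0])
that_ray = (origin, np.array([1.0, -1.5, 2.0]))     # pointing at the sun
there_ray = (origin, np.array([-1.0, -1.5, 2.0]))   # pointing at empty space
print(put_that_there(that_ray, there_ray, objects))
```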

A new direction

In the years following Richard's research, technology would advance to offer another way to improve the Human Machine Interface (HMI). By the mid-'80s the mouse, a pointing device for 2D screen navigation, had evolved into an accurate, cost-effective, and convenient method for navigating a graphical interface. Popularized by Apple's Lisa and Macintosh computers, and supported by the largest software developer, Microsoft, the mouse would become the primary input for computer navigation over the next 20 years.


“The Macintosh uses an experimental pointing device called a ‘mouse’. There is no evidence that people want to use these things.”

San Francisco Examiner, John C. Dvorak – image provided by…

 

In 2007, technology advancements helped Apple once again popularize an equally controversial device, the iPhone. With its touch sensitive screen and gesture recognition, the touch interface in all its forms has now become the dominant form of HMI.

The rebirth of natural gesture

Although seemingly dormant throughout the '80s and '90s, research continued to refine a variety of methods for depth and gesture recognition. In 2003 Sony released the EyeToy for the PlayStation 2, which enabled Augmented Reality (AR) experiences and could track simple body motions. Then in 2006 Nintendo released a new console, the Wii, which used infrared in combination with handheld controllers to detect hand motions for video games. The Wii controllers, with their improved precision over Sony's EyeToy, proved wildly successful and set the stage for the next evolution in natural gesture.

In 2009 Microsoft announced the Kinect for Xbox 360, with its ability to read human motions to control our games and media user interface (UI), without the aid of physical controllers.

What Richard Bolt had demonstrated some 30+ years prior was finally within grasp. Since the premiere of Kinect, we've seen more progress in the development of computer vision and recognition technologies than in the previous 35 years combined. Products like the Asus Xtion, Creative Senz3D, and Leap Motion have inspired an energetic global community of developers to create countless experiences across a broad spectrum of use cases.

The future’s so bright

To this day, Richard's research speaks to the core of what natural gesture technology aims to achieve: that "natural user modality". While advances in HMI have continued to iterate and improve over time, the medium for our visual interaction has remained largely the same: the screen. Navigation of our modern UI has been forced to work within the limits of the 2D screen. With the emergence of AR and VR, our traditional forms of HMI do not provide the same accessible input that the mouse and touch interfaces of the past did. Our HMI must evolve to allow users to interact with the scene and not the screen.

CES 2014, Road to VR: SoftKinetic premieres hand and finger tracking for VR.

 

Next, we’ll explore how sensors, not controllers, will provide the “natural user modality” that will propel AR and VR to become more pervasive than mobile is today. The answer, it seems, may be right in front of us…we just need to reach out and grab it.