Natural Interaction for Augmented Reality Applications
Mark Billinghurst
The HIT Lab NZ, University of Canterbury
November 28th 2013
1977 – Star Wars
Augmented Reality Definition: Defining Characteristics
- Combines real and virtual images: both can be seen at the same time
- Interactive in real time: the virtual content can be interacted with
- Registered in 3D: virtual objects appear fixed in space
Azuma, R. T. (1997). A survey of augmented reality. Presence, 6(4), 355-385.
Augmented Reality Today
AR Interface Components
Key Question: How should a person interact with Augmented Reality content?
Connecting physical and virtual with interaction:
- Physical elements
- Virtual elements
- Interaction metaphor (input and output)
AR Interaction Metaphors
- Information browsing: view AR content
- 3D AR interfaces: 3D UI interaction techniques
- Augmented surfaces: tangible UI techniques
- Tangible AR: tangible UI input + AR output
Tangible User Interfaces
Use physical objects to interact with digital content
- Foreground: graspable user interfaces
- Background: ambient interfaces
Ishii, H., & Ullmer, B. (1997). Tangible bits: towards seamless interfaces between people, bits and atoms. In Proceedings of the ACM SIGCHI Conference on Human factors in computing systems (pp. 234-241). ACM.
TUI Benefits and Limitations
Pros
- Physical objects make us smart
- Objects aid collaboration
- Objects increase understanding
Cons
- Difficult to change object properties
- Limited display capabilities (2D view)
- Separation between object and display
Tangible AR Metaphor
AR overcomes the limitations of TUIs:
- Enhances display possibilities
- Merges task and display space
- Provides public and private views
TUI + AR = Tangible AR: apply TUI methods to AR interface design
VOMAR Demo (Kato et al. 2000): AR Furniture Arranging
Elements + Interactions
- Book: turn over the page
- Paddle: push, shake, incline, hit, scoop
Kato, H., Billinghurst, M., et al. 2000. Virtual Object Manipulation on a Table-Top AR Environment. In Proceedings of the International Symposium on Augmented Reality (ISAR 2000), Munich, Germany, 111--119.
Lessons Learned
Advantages
- Intuitive interaction, ease of use
- Full 6 DOF manipulation
Disadvantages
- Marker-based tracking: occlusion, limited tracking range, etc.
- Needs external interface objects: paddle, book, etc.
2012 – Iron Man
To Make the Vision Real...
Hardware/software requirements:
- Contact lens displays
- Free-space hand/body tracking
- Speech/gesture recognition
- Etc.
Most importantly: usability / user experience
Natural Interaction
- Environmental awareness: automatically detecting the real environment; physically based interaction
- Gesture interaction: free-hand interaction
- Multimodal input: speech and gesture interaction
- Intelligent interfaces: implicit rather than explicit interaction
Environmental Awareness
AR MicroMachines
An AR experience with environmental awareness and physically based interaction
- Based on the MS Kinect RGB-D sensor
- Augmented environment supports occlusion and shadows
- Physically based interaction between real and virtual objects
Clark, A., & Piumsomboon, T. (2011). A realistic augmented reality racing game using a depth-sensing camera. In Proceedings of the 10th International Conference on Virtual Reality Continuum and Its Applications in Industry (pp. 499-502). ACM.
Operating Environment
Architecture
Our framework uses five libraries: OpenNI, OpenCV, OPIRA, Bullet Physics, OpenSceneGraph
System Flow
The system flow consists of three stages:
1. Image processing and marker tracking
2. Physics simulation
3. Rendering
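As a hypothetical sketch (not the actual framework code), the three-stage flow above can be organized as a per-frame loop; the frame data and function names here are invented for illustration:

```python
# Hypothetical sketch of the three-stage per-frame flow: tracking, physics,
# rendering. The "frame" is simulated data, not input from a real camera.

def track_markers(frame):
    """Image processing / marker tracking: return visible marker poses."""
    return [m for m in frame["markers"] if m["visible"]]

def simulate_physics(markers, state):
    """Physics simulation: step virtual objects against tracked geometry."""
    state["steps"] += 1
    state["poses"] = [m["pose"] for m in markers]
    return state

def render(state):
    """Rendering: produce a draw list (here, just a description string)."""
    return f"frame {state['steps']}: {len(state['poses'])} tracked object(s)"

state = {"steps": 0, "poses": []}
frame = {"markers": [{"visible": True, "pose": (0, 0, 0)},
                     {"visible": False, "pose": (1, 0, 0)}]}
out = render(simulate_physics(track_markers(frame), state))
print(out)  # frame 1: 1 tracked object(s)
```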
Physics Simulation
- Create a virtual mesh over the real world
- Update at 10 fps, so real objects can be moved
- Used by the physics engine for collision detection (virtual/real)
- Used by OpenSceneGraph for occlusion and shadows
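One way to realize the occlusion step (a simplified sketch, not the MicroMachines implementation): a virtual fragment is drawn only where it is closer to the camera than the real depth captured by the sensor. The 3x3 depth "images" below are toy data:

```python
# Simplified occlusion sketch: a virtual fragment is visible only if it is
# nearer than the real-world depth from the RGB-D sensor. Depths in metres.

real_depth = [
    [2.0, 2.0, 2.0],
    [1.0, 1.0, 2.0],   # a real object 1 m away covers the lower-left
    [1.0, 1.0, 2.0],
]
virtual_depth = [[1.5] * 3 for _ in range(3)]  # virtual object at 1.5 m

def occlusion_mask(real, virtual):
    """True where the virtual object should be drawn (it is in front)."""
    return [[v < r for v, r in zip(vrow, rrow)]
            for vrow, rrow in zip(virtual, real)]

mask = occlusion_mask(real_depth, virtual_depth)
# The real object at 1 m hides the virtual one in the lower-left region.
print(mask)
```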
Rendering
Occlusion Shadows
Gesture Interaction
Natural Hand Interaction
Using bare hands to interact with AR content:
- MS Kinect depth sensing
- Real-time hand tracking
- Physics-based simulation model
Hand Interaction
- Represent hand models as collections of spheres
- Bullet physics engine for interaction with the real world
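The sphere-proxy idea can be sketched as follows: model the hand as a small set of spheres and test them against points sampled from the real-world depth data. The coordinates and radii are illustrative assumptions, not values from the actual system:

```python
# Sketch of the sphere-proxy hand model: a collision occurs when a world
# point (e.g. from the depth mesh) lies inside one of the hand spheres.
import math

hand_spheres = [
    {"center": (0.0, 0.00, 0.5), "radius": 0.030},  # palm
    {"center": (0.0, 0.05, 0.5), "radius": 0.015},  # fingertip
]

def collides(sphere, point):
    """True if a world point lies inside the proxy sphere."""
    return math.dist(sphere["center"], point) <= sphere["radius"]

world_point = (0.0, 0.05, 0.51)  # a point on a real surface
hits = [s for s in hand_spheres if collides(s, world_point)]
print(len(hits))  # only the fingertip sphere touches this point
```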
Scene Interaction
- Render AR scene with OpenSceneGraph
- Use depth map for occlusion
- Shadows yet to be implemented
Architecture
5. Gesture: static gestures, dynamic gestures, context-based gestures
4. Modeling: hand recognition/modeling, rigid-body modeling
3. Classification/Tracking
2. Segmentation
1. Hardware Interface
Architecture: 1. Hardware Interface
o Supports PCL, OpenNI, OpenCV, and the Kinect SDK.
o Provides access to depth, RGB, and XYZRGB data.
o Usage: capturing color images, depth images, and concatenated point clouds from a single camera or multiple cameras.
o For example: Kinect for Xbox 360, Kinect for Windows, Asus Xtion Pro Live.
Architecture: 2. Segmentation
o Segment images and point clouds based on color, depth, and space.
o Usage: segmenting images or point clouds using color models, depth, or spatial properties such as location, shape, and size.
o For example: skin color segmentation, depth thresholding.
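Depth thresholding is the simplest of these segmentation strategies; a minimal sketch (toy depth values, assumed threshold) that keeps only pixels near the camera, e.g. to isolate hands:

```python
# Sketch of depth-threshold segmentation: keep only pixels closer than a
# cut-off distance. The depth "image" is toy data in metres.

depth = [
    [0.40, 0.50, 1.8],
    [0.45, 1.90, 2.0],
]
NEAR_LIMIT = 0.6  # assumed threshold; tune per setup

def segment_by_depth(img, limit):
    """Binary mask: True where the pixel is nearer than the threshold."""
    return [[d < limit for d in row] for row in img]

mask = segment_by_depth(depth, NEAR_LIMIT)
print(mask)  # [[True, True, False], [True, False, False]]
```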
Architecture: 3. Classification/Tracking
o Identify and track objects between frames based on XYZRGB data.
o Usage: identifying the current position/orientation of the tracked object in space.
o For example: a training set of hand poses, where colors represent unique regions of the hand; raw (uncleaned) classifier output on real hand input (depth image).
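Frame-to-frame tracking can be sketched with a greedy nearest-neighbour match; this is a stand-in for the classification/tracking layer (real systems use richer XYZRGB features), with invented positions:

```python
# Sketch of frame-to-frame tracking: assign each tracked object the nearest
# detection in the new frame (greedy nearest neighbour on 3D position).
import math

previous = {"hand": (0.10, 0.20, 0.50)}          # track from last frame
detections = [(0.11, 0.21, 0.50), (0.40, 0.10, 0.80)]  # new-frame detections

def update_tracks(tracks, dets):
    """Match each track to the closest detection by Euclidean distance."""
    return {name: min(dets, key=lambda d: math.dist(pos, d))
            for name, pos in tracks.items()}

tracks = update_tracks(previous, detections)
print(tracks["hand"])  # (0.11, 0.21, 0.5)
```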
Architecture: 4. Modeling
o Hand recognition/modeling: skeleton based (for a low-resolution approximation) or model based (for a more accurate representation)
o Object modeling: identification and tracking of rigid-body objects
o Physical modeling: physical interaction via sphere-proxy, model-based, or mesh-based representations
o Usage: general spatial interaction in AR/VR environments
Architecture: 5. Gesture
o Static: hand pose recognition
o Dynamic: meaningful movement recognition
o Context-based: gestures with context, e.g. pointing
o Usage: issuing commands, anticipating user intention, and high-level interaction
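A static gesture recognizer can be sketched as a mapping from hand-pose features to named gestures; the feature (extended-finger count) and the pose names below are illustrative assumptions, not the actual classifier:

```python
# Sketch of static gesture (hand pose) recognition: classify a pose from the
# number of extended fingers reported by the modeling layer.

def classify_pose(extended_fingers):
    """Map an extended-finger count to a named static gesture."""
    if extended_fingers == 0:
        return "fist"       # e.g. grab
    if extended_fingers == 1:
        return "point"      # context-based when aimed at an object
    if extended_fingers == 5:
        return "open hand"  # e.g. release
    return "unknown"

print(classify_pose(1))  # point
```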
Skeleton Based Interaction
- 3Gear Systems
- Kinect/PrimeSense sensor
- Two-hand tracking
- http://www.threegear.com
Skeleton Interaction + AR
- HMD AR view
- Viewpoint tracking
- Two-hand input
- Skeleton interaction, occlusion
Multimodal Input
Multimodal Interaction
Combined speech and gesture input; the two modalities are complementary:
- Speech: modal commands, quantities
- Gesture: selection, motion, qualities
Previous work found multimodal interfaces intuitive for 2D/3D graphics interaction.
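The complementary split above can be sketched as time-window fusion: speech supplies the command, gesture supplies the referent, and the two are combined when their timestamps are close. The events, field names, and window size are invented for illustration:

```python
# Sketch of speech + gesture fusion: a speech command ("colour red") is bound
# to the object the user is pointing at, if the events are close in time.

FUSION_WINDOW = 0.8  # seconds; assumed value

def fuse(speech, gesture):
    """Combine a speech command with a pointing gesture if close in time."""
    if abs(speech["t"] - gesture["t"]) <= FUSION_WINDOW:
        return {"command": speech["command"], "target": gesture["target"]}
    return None  # no fusion: events too far apart

speech = {"t": 2.1, "command": "colour red"}
gesture = {"t": 1.9, "target": "cube_3"}
print(fuse(speech, gesture))  # {'command': 'colour red', 'target': 'cube_3'}
```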
Free-Hand Multimodal Input
Use free hands to interact with AR content; recognize simple gestures:
- Point
- Move
- Pick/Drop
Lee, M., Billinghurst, M., Baek, W., Green, R., & Woo, W. (2013). A usability study of multimodal input in an augmented reality environment. Virtual Reality, 17(4), 293-305.
Multimodal Architecture
Multimodal Fusion
Hand Occlusion
Experimental Setup
Change object shape and colour
User Evaluation
User Evaluation
- Task: change object shape, colour, and position
- Conditions: speech only, gesture only, multimodal
- Measures: performance time, errors, subjective survey
Results
- Average performance time (multimodal and speech fastest):
  - Gesture: 15.44 s
  - Speech: 12.38 s
  - Multimodal: 11.78 s
- No difference in user errors
- Subjective survey, Q1 ("How natural was it to manipulate the object?"): MMI and speech rated significantly better
- 70% preferred MMI, 25% speech only, 5% gesture only
Intelligent Interfaces
Intelligent Interfaces
Most AR systems are not intelligent:
- They don't recognize user behaviour
- They don't provide feedback
- They don't adapt to the user
Intelligence is especially important for training:
- Scaffolded learning
- Moving beyond checklists of actions
Intelligent Interfaces
AR interface + intelligent tutoring system
- ASPIRE constraint-based system (from the University of Canterbury)
- Constraints: relevance condition, satisfaction condition, feedback
Westerfield, G., Mitrovic, A., & Billinghurst, M. (2013). Intelligent Augmented Reality Training for Assembly Tasks. In Artificial Intelligence in Education (pp. 542-551). Springer Berlin Heidelberg.
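A constraint in this style has a relevance condition, a satisfaction condition, and feedback shown when the constraint is relevant but violated. A minimal sketch, with an invented assembly-task state (not the actual ASPIRE domain model):

```python
# Sketch of constraint-based tutoring: each constraint pairs a relevance
# condition with a satisfaction condition and corrective feedback.

constraints = [
    {
        "relevance": lambda s: s["step"] == "attach_arm",
        "satisfaction": lambda s: s["screws"] >= 2,
        "feedback": "The arm needs at least two screws before you continue.",
    },
]

def check(state):
    """Return feedback for every relevant-but-violated constraint."""
    return [c["feedback"] for c in constraints
            if c["relevance"](state) and not c["satisfaction"](state)]

print(check({"step": "attach_arm", "screws": 1}))
# ['The arm needs at least two screws before you continue.']
```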
Domain Ontology
Intelligent Feedback
Actively monitors user behaviour Implicit vs. explicit interaction
Provides corrective feedback
Evaluation Results
- 16 subjects, with and without the ITS
- Improved task completion
- Improved learning
Intelligent Agents
AR characters: virtual embodiment of the system; multimodal input/output
Examples: AR Lego, Welbo, Mr Virtuoso
- AR character more real, more fun
- On-screen 3D and AR similar in usefulness
Wagner, D., Billinghurst, M., & Schmalstieg, D. (2006). How real should virtual characters be?. In Proceedings of the 2006 ACM SIGCHI international conference on Advances in computer entertainment technology (p. 57). ACM.
Looking to the Future
What’s Next?
Directions for Future Research
- Mobile gesture interaction: tablet and phone interfaces
- Wearable systems: Google Glass
- Novel displays: contact lenses
Mobile Gesture Interaction
Motivation: richer interaction with handheld devices; natural interaction with handheld AR
- 2D tracking: fingertip tracking [Hurst and Wezel 2013]
- 3D tracking: hand tracking [Henrysson et al. 2007]
Henrysson, A., Marshall, J., & Billinghurst, M. (2007). Experiments in 3D interaction for mobile phone AR. In Proceedings of the 5th international conference on Computer graphics and interactive techniques in Australia and Southeast Asia (pp. 187-194). ACM.
Fingertip Based Interaction
System Setup Running System
Bai, H., Gao, L., El-Sana, J., & Billinghurst, M. (2013). Markerless 3D gesture-based interaction for handheld augmented reality interfaces. In SIGGRAPH Asia 2013 Symposium on Mobile Graphics and Interactive Applications (p. 22). ACM.
Mobile Client + PC Server
System Architecture
3D Prototype System
- 3Gear + Vuforia: hand tracking + phone tracking
- Freehand interaction on the phone
- Skeleton model, 3D interaction
- 20 fps performance
Google Glass
User Experience
- Truly wearable computing: weighs less than 46 grams
- Hands-free information access: voice interaction, ego-vision camera
- Intuitive user interface: touch, gesture, speech, head motion
- Access to all Google services: Maps, Search, Location, Messaging, Email, etc.
Contact Lens Display
Babak Parviz, University of Washington
- MEMS components: transparent elements, micro-sensors
- Challenges: miniaturization, assembly, eye safety
Contact Lens Prototype
Conclusion
Conclusions
- AR experiences need new interaction methods
- Enabling technologies are advancing quickly: displays, tracking, depth-capture devices
- Natural user interfaces are now possible: free-hand gesture, speech, intelligent interfaces
- Important research directions for the future: mobile, wearable, displays
More Information
• Mark Billinghurst – Email: [email protected]
– Twitter: @marknb00
• Website – http://www.hitlabnz.org/