[IEEE 2013 World Haptics Conference (WHC 2013) - Daejeon (2013.4.14-2013.4.17)] 2013 World Haptics Conference (WHC) - MPEG-V standardization for haptically interacting with virtual worlds
MPEG-V Standardization for Haptically Interacting with Virtual Worlds
Jaeha Kim, Yeongmi Kim, Jeha Ryu*
Gwangju Institute of Science and Technology
ABSTRACT
With rapid developments in virtual reality (VR), haptics, communications technology, and digital multimedia, there has been increasing demand for the creation and widespread development of more realistic and immersive display systems. Consumers are eager to experience engrossing content that provides a sense of presence beyond traditional audio-visual media. In response to this demand, tactile and haptic interaction is becoming increasingly important. Recently, the MPEG-V standard (ISO/IEC 23005) was published. The MPEG-V provides an architecture and metadata for interaction and interoperability between virtual worlds and real worlds through various sensors and actuators. The aim of this paper is to provide an overview of the MPEG-V standard, particularly regarding haptic and tactile interactions.
KEYWORDS: Haptics, virtual environments, standardization, MPEG-V
INDEX TERMS: H.5.2 [Information Interfaces and Presentation]: User Interfaces - Haptic I/O
1 INTRODUCTION
In recent years, technology that gives users a more realistic and engaging feeling of being there, known as presence, has developed dramatically by integrating existing and emerging technologies such as UHD video, 3DTV, VR, and haptics. For example, some theme parks and movie theaters provide immersive and interactive entertainment to the public. Disneyland and Universal Studios provide 4D content for a more realistic experience, incorporating 3D stereoscopic pictures, multichannel audio effects, and tactile effects such as vibrating chairs synchronized with a melody or with the context. Reflecting these developments, demand for tactile and haptic interaction has also been increasing. In addition to conventional media such as images, video, and audio, haptics is now expected to play a prominent role in various applications to provide immersiveness.
Many researchers have recognized the need for a formal description of haptic interaction and haptic-related applications. To describe the various elements of a haptic application systematically, some researchers have proposed structured data formats. Carter et al. proposed an XML-based approach to represent generic haptic applications. Cha et al. extended the MPEG-4 Binary Format for Scenes (BIFS) to support the representation of synchronization and the transmission of haptic data and audio-visual media. El-Far et al. proposed HAML, an XML-based language intended to describe haptic-related information, including haptic devices, haptic development APIs, virtual environments, and communication.
Based on these existing studies, considerable progress has been made in international standardization for haptic interactions. With respect to haptic technology, two international standards are currently in progress. The first is a set of new parts of ISO 9241 (Ergonomics of Human-System Interaction), which is being developed by the working group ISO TC159/SC4/WG9 (Tactile and Haptic Interactions). ISO TC159/SC4/WG9 provides comprehensive guidelines for haptic interactions: ergonomic requirements and recommendations for haptic and tactile hardware and software interactions, including guidance on the design and evaluation of hardware, software, and combinations of hardware and software interactions. For detailed information, see .
The other haptic-related international standard is the MPEG-V, from the Moving Picture Experts Group (MPEG, ISO/IEC JTC1/SC29/WG11). The MPEG-V is a standard for interfacing with virtual worlds and real worlds through various sensors and actuators, including tactile and kinesthetic devices. In this paper, we introduce the state of the art of the MPEG-V standard, focusing on haptic technology, and show several use case scenarios.
The remainder of the paper is organized as follows: in section 2, we introduce the MPEG-V standard and its scope and features. Section 3 covers the details of the MPEG-V, mainly focusing on haptic-related features. In section 4, several use case scenarios with haptic information are presented to describe how haptic contents can be applied in MPEG-V. Some related issues and research challenges are discussed in section 5.
2 MPEG-V: MEDIA CONTEXT AND CONTROL
The MPEG is a working group of experts that sets standards for various technologies related to audio and video compression and transmission. These days, many people want a more immersive experience that goes beyond the visual and auditory senses alone. Accordingly, multi-modal multimedia content is coming into the spotlight, and interaction and interoperability between multimedia content and various sensors and actuators are becoming important issues. In recent years, the MPEG-V (ISO/IEC 23005) standard was established. Work on the MPEG-V started in 2007; it was accepted by the ISO members as an International Standard and published by ISO in 2011. This standard provides an architecture (see Figure 1) and associated information representations for interaction and interoperability between virtual worlds and real worlds through various sensors and actuators.
*e-mail: firstname.lastname@example.org (corresponding author). Human-Machine-Computer Interface Laboratory, GIST, Gwangju, Korea
IEEE World Haptics Conference 2013, 14-18 April, Daejeon, Korea. 978-1-4799-0088-6/13/$31.00 © 2013 IEEE
The MPEG-V standard is divided into seven parts. Part 1 deals with the MPEG-V system architecture and the overall scope, and introduces various application scenarios. Part 2 provides control information for manipulating devices in the real world as well as in virtual worlds. The adaptation engine (RV or VR engine, shown in Figure 1), which is not within the scope of the MPEG-V, is an interface that not only communicates but also calibrates sensory information between the virtual world and the real world, based on the user's sensory effect preferences, sensory device capabilities, sensor capabilities, and sensor adaptation preferences. It takes six inputs: sensory effects (SE), the user's sensory effect preferences (USEP), sensory device capabilities (SDC), sensor capability (SC), sensor adaptation preference (SAP), and sensed information (SI). It then outputs sensory device commands to control external devices in the real world or to manipulate and feel the virtual world objects. The scope of part 2 covers the interfaces between the adaptation engine and the capability descriptions of various sensors and actuators in the real world, and preference information.
Part 3 defines various sensory effects that a content creator might wish to provide to the user; the number of these effects is very large and growing. This part serves to enhance the experience of users while they consume diverse types of media resources. The effects include, for instance, light, temperature, wind, vibration, scent, tactile, and kinesthetic effects. Part 4 deals with various properties a virtual object or avatar can have, such as appearance, animation, scent, sound, and haptic properties. It also includes information related to the manipulation of virtual objects (or avatars), such as scaling and setting position and orientation. Part 5 deals with data formats for interaction devices, such as device commands and sensed information. This part provides data formats for industry-ready interaction devices, including various sensors and actuators. Part 6 provides common data types and tools, and part 7 provides conformance and reference software. Note that the adaptation engine (RV or VR engine) is not within the scope of the MPEG-V standard. Please see [9-15] for more details.
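Because the adaptation engine is outside the scope of the standard, its internal logic is left to implementers. The following Python sketch illustrates one way such an engine might combine an authored effect, a user preference, and a device capability into a device command. It covers only three of the six inputs, and every class name, field, and the normalized intensity range is a hypothetical choice made for illustration, not an MPEG-V schema element.

```python
from dataclasses import dataclass


@dataclass
class SensoryEffect:
    """Hypothetical stand-in for an authored sensory effect (SE)."""
    kind: str          # e.g. "vibration"
    intensity: float   # assumed to be normalized to [0, 1]


@dataclass
class UserPreference:
    """Hypothetical stand-in for a user's sensory effect preference (USEP)."""
    max_intensity: float  # strongest effect the user accepts, in [0, 1]


@dataclass
class DeviceCapability:
    """Hypothetical stand-in for a sensory device capability."""
    kind: str        # effect type the actuator can render
    max_level: int   # highest drive level the device accepts


def adapt(effect, preference, capability):
    """Return an integer drive level, or None if the device cannot render the effect."""
    if effect.kind != capability.kind:
        return None
    # Respect the user's limit first, then keep the value inside [0, 1].
    wanted = min(effect.intensity, preference.max_intensity)
    wanted = max(0.0, min(1.0, wanted))
    # Scale the normalized intensity to the device's own drive range.
    return round(wanted * capability.max_level)
```

Under these assumptions, an authored intensity of 0.9, a user limit of 0.5, and a vibration device with 100 drive levels yield a drive level of 50, while a wind effect sent to the same device yields no command.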
With respect to haptic applications, part 4 of the MPEG-V can be used to describe material properties of virtual objects such as stiffness, damping, friction, and mass. Haptic sensory effects (i.e., tactile/kinesthetic feedback), which reflect the intention of a haptic content creator, can be described with part 3 of the standard. With part 2, it is possible to describe the specifications, capabilities, and user preferences of various sensors and actuators, such as force and torque sensors, encoders, tachometers, accelerometers, vibrotactile motors, Peltier devices, and kinesthetic devices. Part 5 can be used for operating various haptic (tactile/kinesthetic) devices based on sensed data. In the following section, the details of the MPEG-V are discussed, mainly focusing on haptics.
3 MPEG-V FRAMEWORK FOR HAPTICS
In this section, we concentrate on the MPEG-V standard, with emphasis on its haptic-related parts. Content associated with haptic technology can be found throughout the parts of the standard. In part 1, a basic concept and several use case scenarios with haptic information are presented in order to describe how haptic content can be applied in the MPEG-V through kinesthetic and tactile devices. Part 2 provides control information for various device and sensor capability types, as well as for the user's sensory and sensor adaptation preferences. For instance, TactileCapabilityType and KinestheticCapabilityType are tools for describing the capabilities of tactile and kinesthetic effects, and TactilePrefType and KinestheticPrefType are used to describe the corresponding preferences.
As a part of sensory information, TactileType and Passive/ActiveKinestheticType are defined in part 3. TactileType can provide vibration, pressure, temperature, etc., directly onto areas of human skin through many types of actuators, such as motors, piezo-actuators, and thermal actuators. In the PassiveKinestheticMotionType and PassiveKinestheticForceType modes, a kinesthetic device guides the user according to recorded motion trajectories or recorded force/torque histories, respectively. ActiveKinestheticType provides an interaction mode in which the user touches an object at will and the computed forces and torques are then provided.
Part 4 defines various haptic data such as haptic material properties, dynamic force effects, and tactile properties. MaterialPropertyType contains the description of a material property associated with a virtual world object; it includes stiffness, damping, static/dynamic friction, mass, and texture. DynamicForceEffectType can be used to describe a force field effect or haptic guidance along a recorded motion trajectory. TactileType describes tactile properties, such as temperature, vibration, and tactile patterns, associated with the virtual world object.
In part 5, data formats for exchanging information with interaction devices are defined. With regard to haptic interaction, this part defines TactileType, KinestheticType, and various sensor types such as PositionSensorType, VelocitySensorType, ForceSensorType, and TorqueSensorType. As mentioned earlier, part 6 provides common data types and tools and part 7 provides conformance and reference software.
The following sections explain how the MPEG-V can be used for editing, authoring, and rendering haptic content.
3.1 Editing and Authoring Haptic Content Using MPEG-V
A virtual world may consist of many types of audio-visual media sources such as 3D models, images, sound effects, text etc. Haptic media can be in harmony with those resources in order to produce more meaningful and immersive content. In this respect, haptic content can be defined as a data set that includes synthesized and closely synchronized information about haptic and audio-visual data. This concept is depicted in Figure 2.
Just as audio-visual media are created with dedicated tools (for example, 3D models are designed in modeling tools such as 3DS-MAX and Blender, and video is edited with Adobe Premiere), haptic data can be created using modeling and authoring tools [18-21]. The MPEG-V can help the haptic content creator represent haptic data in a proper format in the editing (modeling) and authoring stages. For example, TactileType in part 3 provides viewers with tactile feedback over a series of scenes. A tactile effect may be represented by an ArrayIntensity or a TactileVideo, or by audio file(s) containing the waveform of the tactile effect. Figure 3 shows the use of TactileVideo for authoring vibrotactile effects onto some scenes.
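To make the idea of an intensity array more concrete, the sketch below assumes that one frame of a tactile video is a grayscale image whose pixel values (0 to 255) encode vibration strength, and averages it down to the resolution of a small actuator grid. This encoding, the function name, and the averaging scheme are illustrative assumptions and are not taken from the MPEG-V schema.

```python
def frame_to_intensities(frame, rows, cols):
    """Average a grayscale frame into a rows x cols grid of values in [0, 1].

    frame is a list of equal-length rows of integers in 0..255. The frame is
    assumed to have at least `rows` rows and `cols` columns, so that every
    actuator cell covers at least one pixel.
    """
    height, width = len(frame), len(frame[0])
    grid = []
    for r in range(rows):
        # Pixel rows covered by actuator row r.
        y0, y1 = r * height // rows, (r + 1) * height // rows
        out_row = []
        for c in range(cols):
            # Pixel columns covered by actuator column c.
            x0, x1 = c * width // cols, (c + 1) * width // cols
            block = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out_row.append(sum(block) / (len(block) * 255))
        grid.append(out_row)
    return grid
```

With a 4 x 4 frame whose top-left quadrant is fully white and whose remaining pixels are black, a 2 x 2 actuator grid receives full intensity on its top-left actuator only.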
Haptic content may include the geometry of an object or refer to its location (e.g., using a URI), to provide visual cues as well as haptic sensations for the virtual objects that the user touches. Alternatively, haptic data may use 2.5D depth images that can be obtained using a Z-cam or KINECT.
Figure 1. System architecture of the MPEG-V. RV: Real-to-Virtual, VR: Virtual-to-Real, RW: Real World, VW: Virtual World. The numbers in brackets refer to specific parts of the standard.
After the geometry of a virtual object has been acquired, various haptic material properties can be added to the geometry model through a series of editing (or modeling) processes to enable viewers to feel the given object. This process can be carried out using a haptic modeler such as HAMLAT or K-Haptic modeler. Moreover, a haptic content creator can add dynamic force effects for manipulating the virtual object. Haptic content can also include motion data acquired from motion capture devices; the trajectory of the captured motion data can be rendered with appropriate kinesthetic feedback.
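One common way to render a recorded trajectory as kinesthetic guidance is to pull the device toward the current trajectory sample with a virtual spring and damper. The one-dimensional sketch below illustrates that idea only; the gains, the time step, the simulated handle mass, and both function names are assumptions chosen for illustration and are not prescribed by the MPEG-V.

```python
def guidance_force(target, position, velocity, stiffness=200.0, damping=5.0):
    """Spring-damper force pulling the device toward the recorded target point."""
    return stiffness * (target - position) - damping * velocity


def replay(trajectory, dt=0.001, mass=0.1):
    """Simulate a free handle of the given mass being guided along the trajectory.

    trajectory is a list of recorded 1-D positions, one per time step of
    length dt. Returns the positions the simulated handle actually visits.
    """
    position, velocity = trajectory[0], 0.0
    visited = []
    for target in trajectory:
        force = guidance_force(target, position, velocity)
        velocity += force / mass * dt   # semi-implicit Euler: update velocity first
        position += velocity * dt       # then advance position with the new velocity
        visited.append(position)
    return visited
```

On a real device the position and velocity would come from the device's sensors rather than from a simulation, and the gains would have to be tuned to the device's stable range.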
At this stage, part 4 of the MPEG-V can be used to describe a virtual world and the virtual objects in a scene together with their visual and haptic material properties. To allow viewers to feel the shape of a virtual object, the object should have a geometry model (Appearance) and properties (MaterialPropertyType) such as stiffness, friction, damping, mass, and texture, as shown in Figure 4.
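As an illustration of how such material properties might be consumed by a haptic renderer, the following sketch computes a penalty-based contact force from stiffness, damping, and dynamic friction. The class, its field names, and the units are hypothetical and only loosely mirror the properties listed above; the penalty model itself is a common rendering technique and is not defined by the MPEG-V.

```python
from dataclasses import dataclass


@dataclass
class HapticMaterial:
    """Hypothetical container for a few haptic material properties."""
    stiffness: float         # assumed unit: N/m
    damping: float           # assumed unit: N*s/m
    dynamic_friction: float  # dimensionless coefficient


def contact_force(material, penetration, penetration_rate, sliding_speed):
    """Return (normal, tangential) force magnitudes for a probe pressing into a surface.

    penetration is the depth of the probe below the surface (positive when in
    contact), penetration_rate is its time derivative, and sliding_speed is
    the probe speed along the surface.
    """
    if penetration <= 0.0:
        return 0.0, 0.0  # the probe is not touching the object
    normal = material.stiffness * penetration + material.damping * penetration_rate
    normal = max(normal, 0.0)  # a surface can push the probe out but never pull it in
    # Coulomb-style friction opposes sliding and scales with the normal force.
    tangential = material.dynamic_friction * normal if sliding_speed != 0.0 else 0.0
    return normal, tangential
```

For example, a material with stiffness 400, damping 2, and dynamic friction 0.5 pressed 0.25 deep by a probe that slides without moving further inward gives a normal force of 100.0 and a tangential force of 50.0.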
Examples of a haptic authoring tool and a haptic modeler for creating haptic content based on the MPEG-V are given in sections 4.1 and 4.2.
3.2 Rendering of Haptic Content
The haptic content created in the modeling and authoring stages is processed in the adaptation engine, which parses and analyses the content in accordance with the Sensory Device Capability, Sensor Capability, and User's Sensory Effect Preference of part 2 of the MPEG-V standard. Finally, the adaptation engine generates Device Commands (in part 5 of the standard) that set the various sensors and actuators into operation and control the haptic devices.
As mentioned above, the adaptation engine is not within the scope of the MPEG-V. In other words, the MPEG-V does not deal with processing and displaying the created haptic content together with
Figure 3. Authoring vibrotactile effects by TactileType of part 3 of
Figure 4. Modeling virtual object with various haptic properties using