
ORIGINAL ARTICLE

Reconfigurable natural interaction in smart environments: approach and prototype implementation

Sara Bartolini · Bojan Milosevic · Alfredo D'Elia · Elisabetta Farella · Luca Benini · Tullio Salmon Cinotti

Received: 7 March 2011 / Accepted: 19 August 2011 / Published online: 13 September 2011
© Springer-Verlag London Limited 2011

Abstract The vision of sensor-driven applications that adapt to the environment holds great promise, but it is difficult to turn these applications into reality because device and space heterogeneity is an obstacle to interoperability and mutual understanding among the smart devices and spaces involved. Smart Spaces provide shared knowledge about physical domains and inherently enable cooperative and adaptable applications by keeping track of the semantic relations between objects in the environment. In this paper, the interplay between sensor-driven objects and Smart Spaces is investigated, and a device with a tangible interface demonstrates the potential of the smart-space-based and sensor-driven computing paradigm. The proposed device is named REGALS (Reconfigurable Gesture based Actuator and Low Range Smartifier). We show how, starting from an interaction model proposed by Niezen, REGALS can reconfigure itself to support different functions such as Smart Space creation (also called environment smartification), interaction with heterogeneous devices, and handling of semantic connections between gestures, actions, devices, and objects. This reconfiguration ability is based on the context received from the Smart Space. The paper also shows how tagged objects and natural gestures are recognized to improve the user experience, reporting a use case and a performance evaluation of REGALS' gesture classifier.

Keywords Ambient intelligence · Smart Space · Tangible user interface · Gesture recognition

1 Introduction

The quality of our life is profoundly affected by the nature of our interactions, not only with other human beings, but also with the objects and environments we have to deal with every day. If we focus on some specific categories of people, this becomes evident: elderly and disabled people are the most aware of how important a smooth and easy interaction can be with home appliances and with other typical physical affordances (e.g. doors, windows) or infrastructures (e.g. lights, HVAC systems) that they encounter in their routine.

For such interactions to be really satisfactory and comfortable, both environments and objects should be smart, so that they can take an active role in the interaction with humans, understanding the situation and reacting in the fastest and most appropriate way. To accomplish that, they must be versatile enough to act differently according to the context. Furthermore, all interactions should minimize the need for explicit commands and should be based on the natural expressions of inter-human communication.

If we look back to 1991, the idea of people interacting with a Smart Environment and with objects in a seamless way was considered a vision, born from the intuition of the upcoming fast progress in communication, processing, and sensing technologies [1]. To capture this ensemble of concepts, the term Ambient Intelligence was coined, as summarized in [2].

S. Bartolini · A. D'Elia · T. S. Cinotti
ARCES—Università di Bologna, Via Toffano 2, 40125 Bologna, Italy
e-mail: [email protected]

B. Milosevic · E. Farella (corresponding author) · L. Benini
DEIS—Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy
e-mail: [email protected]

Pers Ubiquit Comput (2012) 16:943–956. DOI 10.1007/s00779-011-0454-5

Today, we witness the proliferation of conceptual models, real implementations, single applications, etc., all contributing to turning Monekosso's vision into a fact. There are many practical experiences where the space is made able to understand what is going on inside it by means of suitable sensors, communicating wirelessly with each other to form wireless sensor networks (WSN) and improve flexibility and installation cost. Likewise, objects can be made to think by means of embedded microcontrollers or smart tags (based on different technologies, such as RFID, fiducial markers, bar codes [3]), associated with software applications that link the object to services, behaviors, and actions. All the interactions between humans, devices and the smart environment can take place via natural interfaces, exploiting voice, gestures and expressions as humans normally do when interacting with each other.

However, many practical implementations lack abstraction, interoperability or scalability, and require considerable work to be reapplied in a different environment. In particular, this limitation applies to interaction devices and smart objects that are not designed to be universal and programmable and are not able to behave differently according to the appliance they are interacting with.

A step further toward the applicability of Ambient Intelligence (AmI) in everyday life is therefore the availability of a Smart Space intended as a digital entity, a Context Management System, where relevant real-world information is stored and kept up to date. The Smart Space can support upgrading, flexibility, interoperability and modularization. New entities are added as soon as they emerge (e.g. physical objects, digital objects, devices, sensors, smart rooms). More than this, the Smart Space enables flexible handling of semantic connections and interactions between these entities. As a consequence, an object can have a different behavior or function depending on the overall context of the environment, the profile of the user interacting with it or her/his intentions, the entity it is interacting with, etc.

In this work, we therefore present a combination of middleware and smart objects that enables Ambient Intelligence applications supporting gesture-based interaction. We address the need to provide a general model and some practical, real implementations to interact with the Smart Space, targeting not only interaction with appliances and technology, but also the automatic creation of the Smart Space itself. In fact, we describe a new solution to automatically create the Smart Space (i.e. the digital representation of real environments) starting from a physical space (a city, a private or public space, a vehicle) containing real objects which need to be identified, located or monitored (common objects, devices, furniture); we call this the "Smartification" process [4].

Having created a Smart Environment in a fast and easy way, it becomes possible to take advantage of the benefits of the Smart Space, carrying out the following information-level operations:

– Association of selected properties with the state of the objects (e.g. location, type, description)
– Object state monitoring
– Object state update and notification

The Smartification can have many different implementations, and SOFIA,¹ the project supporting this research, is addressing some of them. To prove the concept, we introduce an example of a tangible interface that is able to capture a number of gestures from the user, to perform some Smartification tasks or to control other devices. Such a smart object, called REGALS (Reconfigurable Gesture based Actuator and Low Range Smartifier), is a versatile interaction tool and can be used in different applications, thanks to its ability to reconfigure its functionality according to the user preferences or the context inherited from the Smart Space.

In Sect. 2, we provide background and related work on tangible interfaces and Smart Spaces. In Sect. 3, we introduce the key concepts and the software architecture of the SOFIA Smart Environment, with particular attention to the ontological model we used to represent gestural interaction. In Sect. 4, we introduce REGALS, i.e., the prototype we have realized to demonstrate our approach, describing its hardware architecture, the algorithms for gesture recognition and its behavior during usage in SOFIA use cases. Before drawing conclusions in Sect. 6, in Sect. 5 we show the results of the performance evaluation of the gesture recognition algorithms and we describe in detail an implemented use case for controlling a media player in a smart environment.

2 Background

In the proposed vision, smart environments have an associated digital representation shared by multiple devices. Smart environments support dynamic and context-aware applications in which artificial entities may collaborate and adapt to the user's needs with minimal visibility or impact on the user's life. This scenario was first envisaged by Weiser [1] who, analyzing the state of the art and the trend of technology advances at the time, considered possible the integration of multiple low-cost collaborating computers in a single localized network. Nowadays, processors embedded in everyday objects are able to seamlessly connect users with a wider information world, where user actions, as intuitive and simple as possible, or his/her implicit behavior, are reflected in changes in the information world itself [5].

¹ This research is being developed within the framework of the SOFIA project of the European Joint Undertaking on Embedded Systems ARTEMIS, http://www.sofia-project.eu.

Requirements to implement this scenario are:

– a Context Management System (CMS) accessible by all the relevant interacting entities (i.e. smart objects, devices and spaces), providing their context and being able to drive their behavior in the most appropriate way;
– a concrete interaction model hiding the complexity of the system and its subsystems, supporting (or being general enough to support) facilities for configuration, personalization, extensibility and upgrades.

In recent years, several research studies have focused on CMSs, proposing new solutions for specific domains or classifying them according to different aspects, from system architectures to information storage techniques. CMSs have been proposed in the field of cultural heritage [6] or to support mobile applications [7]. There are also frameworks that help developers create context-aware applications [8] and architectures usable in broad domains [9].

Among the technical features that distinguish the different CMSs at the basis of a smart environment implementation, one of the most relevant for the present discussion is information representation and modeling. Information is usually collected by devices provided with sensory equipment and sent through some protocol to the storage system which, in turn, allows software agents to access the contents. Some of the most common information representation techniques are key-value pairs (e.g. used in [10]), markup schemes derived from SGML [11], logic based on expressions, assertions and rules [12], and ontologies [13].
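To make the distinction concrete, here is a toy illustration (in Python, with invented names) of the same context datum expressed first as a key-value pair and then as an RDF-style triple:

    # The same piece of context in two common representation styles.
    # Key-value pair: compact, but the relation to other facts stays implicit.
    context_kv = {"room1.temperature": "21.5"}

    # RDF-style triple (subject, predicate, object): the relation is explicit
    # and can be anchored to an ontology shared by all devices.
    context_triple = ("urn:room1", "urn:hasTemperature", "21.5")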

The work described herein is based on Smart M3 [14], the interoperability platform under development in the SOFIA European project. A Smart M3-based smart environment is domain agnostic and, since its information refers to an ontology, applications and devices may use a common vocabulary to enhance the interaction with the user.

A smart environment has to be manipulable by humans in order to be accessible. For this reason, the interaction models between the smart environment and the user are one of the most important fields of research in the context of smart environments, and beyond. To improve the quality of the interaction with the surroundings and with objects, natural 3D interaction paradigms are needed to deal with the intrinsically immersive nature of Ambient Intelligence. These paradigms are taken into account in the design and creation of Tangible User Interfaces (TUIs), which introduce tangible devices or smart objects that augment the real physical world by coupling digital information to everyday objects. Smart objects are portable devices with a vast range of functions: their common feature is the on-board integration of sensors, actuators and some processing and communication capability. Furthermore, the opportunity to execute complex processing tasks on board, such as gesture recognition algorithms, without the need to send data streams from the local sensor to a central base station, results in extended battery life, improved system scalability and easier handling of mobile TUIs [15].

An example of this from the literature is the Elope system [16], which combines advanced mobile devices, interactive spaces, and tagged objects to enable the complete configuration of an office-based Smart Space (triggered by a simple user action), including launching the desired application and loading a user's personal data. Another example is the work of Kranz et al. [17], which introduces netgets, specialized networked gadgets with sensors and actuators that let users seamlessly manipulate digital information and data in the context of real-world usage. GeeAir, a WiiMote-based handheld device for remotely controlling home appliances, is presented in [18]. It implements a mixed-modality interface, composed of speech, gesture recognition, joystick and buttons, outlining the need for a set of different paradigms and means to interact with the environment. In this case, an IR-Bluetooth adapter is used to communicate with legacy devices, and the multimodal capabilities improve usability, especially addressing elderly and disabled people as possible users. However, these works have the limitation of interacting with a "passive" Smart Space, which is not pro-active in determining the configuration of the digital entities and in changing their behavior depending on available information or user preferences.

Important features that a smart environment should provide to enhance interaction are configurability and adaptability to different profiles. Once an interacting object is configurable, it is possible to set up its interface so that it communicates with other devices belonging to the smart environment or performs different tasks. Adaptability consists in the possibility of performing the same actions with different modalities (e.g. adapting the interface to a user profile).

The proper usage of domain ontologies in a context-aware architecture allows many different functionalities (configurability). This is obtained by relating a device's internal events to different semantics according to the specific configuration. In Sect. 3, we detail the interaction ontology, which implements the concepts and relationships that enable a controller to adapt to different usage profiles.

Some examples of this approach are currently being studied in the SOFIA project. The work of [19] discusses how to represent low-level interaction events to provide easy controls of, and in, a smart environment. In [20] and [21], these concepts are implemented: in the former, a camera-equipped handheld device is used to assign digital information to a physical object, identified by its picture; in the latter, a specific interaction device is proposed, the Interaction Tile, which allows users to explore and create semantic connections between different devices in the smart environment.

The work described here contributes to this research scenario not simply by providing a device integrated in an existing software architecture, but by demonstrating a new approach to the interaction between a human and the surrounding environment. By means of a semantic representation, at the information level, of all the relevant concepts, the gestures performed are decoupled from the functions they activate, providing complete adaptability to tasks and preferences. The same object is able to identify RFID codes or to recognize gestures and, through its wireless connectivity, will upgrade a smart environment not only with a new functionality, but more generally with gesture-based natural interaction. Two different tasks have been implemented (namely the "Smartification" and "Remote Control" tasks) and will be described, but many others are possible. Contributions from different research areas, such as ontology-based interoperability in smart environments and natural interaction (e.g. gesture recognition, tangible interfaces), are combined to demonstrate, through the example of a working prototype, the benefits of such a merger.

3 Definitions and SOFIA ontology

In this section, we introduce the main entities in a Smart M3-based [14] Smart Space, where REGALS will perform its actions. Smart M3 is a CMS conceived by Nokia to enable multi-vendor, multi-device and multi-part applications through interoperability at different levels of abstraction. Without going into detail, there are two layers of interoperability relevant for the current implementation: communication-level interoperability allows the distributed processes to communicate with a central entity, while interoperability at the information level is a quality of systems in which the different interacting entities are able to understand each other without ambiguity. Figure 1 shows the Smart Space software architecture and various ways of connecting existing components to the smart platform. The aim of the Smart M3 interoperability platform, in fact, is not only to give a set of rules that, if respected, allow developers to create multi-platform applications, but also to give methodologies to connect existing technologies to the Smart Space. The compatibility with legacy devices enables the early creation of applications and services and a wider range of target platforms.

An adapter knowledge processor (KP) is a software program which queries, subscribes and inserts in the Smart Space all the information relevant to the legacy device considered. In (a), the adapter KP is directly embedded in the legacy device, turning it into a Smart Space-enabled device. This clean solution is not always applicable because the target device may not be programmable or may be resource constrained. In this case, other solutions are possible, such as implementing the adapter in the platform hosting the SIB and to which the legacy device is connected (b), or in a helper device (c), as naturally happens for wearable sensors [22]. In (d), a situation typical of modular devices is represented. With this kind of hardware, the KP runs in the module controlling the wireless interface, while the other modules are interfaced to the system through drivers (D) that take into account the domain specification in semantic format. This is the case of the REGALS, which will be detailed later.

Interoperability at the communication level is obtained through the Smart Space Access Protocol (SSAP), used by all the devices to interact with the central context repository, called the Semantic Information Broker (SIB). The communication between two devices is compliant with the software architecture if mediated by the SIB, which also provides efficient support for a subscription-notification mechanism. Interoperability at the information level is provided by the adoption of machine-readable technologies for data representation. In particular, the RDF [23] triple ⟨subject, predicate, object⟩ is used as the elementary piece of information, in which the subject and the predicate are represented by unique identifiers (URIs), while the object may be a URI or a literal value. OWL [24] ontologies specify the domain of interest from a higher perspective, so that, by referring to them, software developers are able to create KPs consistent with the data semantics.

Fig. 1 Representation of SOFIA architecture and knowledge processors
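As a rough sketch of how a KP interacts with the SIB, the Python fragment below models triples as tuples and uses a hypothetical SSAP client wrapper; the class and method names are illustrative and do not reproduce the actual Smart M3 libraries:

    # Hypothetical SSAP client: join a Smart Space, insert triples, subscribe.
    from typing import Callable, List, Optional, Tuple

    Triple = Tuple[str, str, Optional[str]]   # (subject, predicate, object); None = wildcard

    class SIBClient:
        """Placeholder for an SSAP connection to the Semantic Information Broker."""
        def join(self, space_name: str) -> None: ...
        def insert(self, triples: List[Triple]) -> None: ...
        def subscribe(self, pattern: Triple,
                      callback: Callable[[List[Triple]], None]) -> None: ...

    def on_change(added: List[Triple]) -> None:
        print("SIB notification:", added)            # the SIB pushes matching triples

    sib = SIBClient()
    sib.join("sofia_space")                           # communication-level interoperability (SSAP)
    sib.insert([("SmartObject_01", "rdf:type", "SmartObject")])
    sib.subscribe(("SmartObject_01", "hasEvent", None), on_change)   # information-level interoperability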

Ontologies are knowledge representation models, based on labelled directed graphs, aimed at describing a domain of interest in a machine-readable format. Ontologies can be used for many purposes, such as defining a common language on which multi-vendor devices may rely to communicate; it is also possible to reason over context, since a description logic [25] can be associated with an ontology, or to run consistency checks to verify that the set of triples asserted in the Smart Space corresponds to a valid configuration.

A good ontological solution describes the relevant entities, their properties and their relationships down to a level of detail that depends on the application needs. In our ontology development, we worked in a collaborative scenario where different working groups worked on different aspects of the target smart environment. The natural property of ontologies of being extensible and easily understandable, thanks to the many software tools available for editing and visualization, allows this kind of modular and collaborative development.

The interaction ontology or, better, the semantic infrastructure, is used to allow interaction with a generic appliance through the insertion of the recognized RFIDs and gestures into the device's state. The ontological model to control an appliance in a SOFIA smart environment has been realized by Niezen et al. [19], as a derivation of the considerations made by the recently established W3C Web Events Working Group [26]. The W3C Web Events Working Group has defined a layered conceptual model for interaction with multi-touch and pen-tablet input devices. The four layers have different levels of abstraction, going from the physical layer to a more abstract representational layer; three of them are used to construct the current model for the REGALS behavior, because our device is simpler than a generic SOFIA device and than the target of the conceptual modeling from which the interaction ontology originated.

In Fig. 2, we report the piece of ontology that has been used to model the REGALS as a SOFIA interacting device. As will be detailed in Sect. 4, the REGALS is an instance of a smart object provided with a certain number of interaction primitives; some of the interaction primitives are mapped to the events that represent the functionalities the user wants to activate by performing the corresponding gesture. With reference to the previously cited levels of abstraction, the interaction primitives lie somewhere between the physical layer and the gestural layer, because they are related both to the gestures physically performed and to their mapping to actions; the intentional layer, related to the actions the user wants to perform, is represented by the event class. Events may be activated through different means, such as a button press, gesture or voice command, all referring to the same intention. A semantic transformer is responsible for interpreting multimodal interaction data coming from different smart objects and data sources as possible user goals, and for mapping them onto the plurality of available services. This high level of abstraction enables developers to write applications which will work across different devices and services, without having to write specific code for each possible input device or service used.

In this work, we will show how the Interaction Ontology can be used to describe a category of gesture-based and reconfigurable smart objects. The specificity of these devices is that they are not committed to a unique interaction domain, so that they are inherently very general and their semantic transformers are mapped by the Smart Space. In other words, a KP can dynamically assign them a function, which is therefore inherently context dependent. Through the implementation of a specific device, the already mentioned REGALS, we will show how the Interaction Ontology may be used to tailor its functionality: new functions may be assigned by the Smart Space to the REGALS just by adding new mappings between gestural and intentional events.
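A minimal way to picture the role of the semantic transformer is a mapping, chosen by the Smart Space, from interaction primitives to intentional events that depends on the connected device. The sketch below is purely illustrative; the gesture and event names are taken from the use cases of Sect. 4.3, while the lighting-related names are invented:

    # Context-dependent mapping between gestural and intentional events.
    # The Smart Space can rewrite these mappings at any time, which is what
    # makes the device reconfigurable without changing its firmware.
    SEMANTIC_TRANSFORMERS = {
        "MediaPlayer_01": {"MoveDown": "PlayEvent",    "MoveUp": "PauseEvent"},
        "Lights_01":      {"MoveDown": "LightOnEvent", "MoveUp": "LightOffEvent"},
    }

    def to_event(connected_device: str, interaction_primitive: str) -> str:
        """Translate a gestural event into the intentional event for this context."""
        return SEMANTIC_TRANSFORMERS[connected_device][interaction_primitive]

    assert to_event("MediaPlayer_01", "MoveDown") == "PlayEvent"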

4 REGALS

REGALS (Reconfigurable Gesture based Actuator and Low Range Smartifier) is a smart object, seen as an interaction device, which enables the user to interact with a smart environment through objects that have RFID tags or a digital interface toward the SIB.

Fig. 2 SOFIA interaction ontology: view of the most relevant classes and properties used for REGALS development

After reading the RFID tag of an object, the REGALS can change the object's digital properties, which are stored in the SIB (Fig. 3a). If the object has a network interface to the SIB, it can also receive commands or information from it, as shown in Fig. 3b. The device has gesture recognition capabilities, enabled by an accelerometer which senses its movements and provides a simple and natural way of interacting with the user.

4.1 Physical layer: hardware and software description

The first prototype of the device is based on a Gumstix board and is equipped with an RFID reader, an inertial sensor module (microcontroller and accelerometer), three LEDs and a PWM-driven vibration motor to provide feedback to the user. Figure 4 summarizes the block diagram of the hardware and firmware blocks.

The control unit is the Gumstix Verdex Pro XL6P COM board [27], which has a Marvell PXA270 600 MHz processor, 128 MB of SDRAM and several connectors for expansion boards. Ethernet and Wi-Fi capabilities are provided via an additional module and allow the device to communicate with the SIB. It runs a Linux kernel optimized for the embedded processor and implements the main features of the KPs, taking care of maintaining the wireless connection with the SIB and coordinating the activity of the other components.

The RFID reader is used to identify the Smart Space entity to interact with. The current REGALS implementation uses the TagSense Nano-UHF [28]. It operates at UHF frequencies (865–868 MHz, 902–928 MHz) and reads and writes RFID tags following the EPC Gen 2 standard. The 6 dBi dipole antenna is connected externally by means of an SMA jack.

The gesture recognizer embeds a low-cost, low-power 8-bit microcontroller (Atmel ATmega168) and a MEMS tri-axial accelerometer (STM LIS3LV02DQ) with a programmable full scale of 2 or 6 g and digital output. This module is connected to the Gumstix with a serial connection and implements a message-based communication protocol to exchange information. The accelerometer senses the orientation and the movements of the device, which are Physical Level events, i.e. the actions taken by the user to interact with the device. The microcontroller implements a gesture recognition algorithm and transforms raw movements into Interaction Primitives passed to the higher levels.
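The message-based serial protocol is not specified in detail in this paper; the following sketch only illustrates the idea of the Gumstix-side adapter reading interaction primitives from the inertial board (the port name, baud rate and ASCII message format are assumptions):

    # Sketch of the Inertial Board adapter: read recognized primitives over serial.
    import serial   # pyserial

    def read_primitives(port: str = "/dev/ttyS0", baud: int = 115200):
        with serial.Serial(port, baud, timeout=1) as link:
            while True:
                line = link.readline().decode(errors="ignore").strip()
                if line.startswith("PRIMITIVE:"):           # hypothetical message format
                    yield line.split(":", 1)[1]              # e.g. "MoveDown"

    # for primitive in read_primitives():
    #     kp.notify(primitive)   # hand the Interaction Primitive to the KP layer (placeholder)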

The firmware part that runs on the Gumstix is a Python framework. The framework consists of a folder (called Engine) which contains the legacy adapters (at present the RFID Reader and the Inertial Board adapters), the SSAP libraries and the run program. The latter launches all the communication functionalities and searches for the applications possibly contained in the Applications folder. Every application is actually an implementation of a KP for the device, and applications can be navigated and launched by means of a menu. As we will see in Sect. 4.3, currently the Smartification and the Remote Control applications are implemented as examples. This software architecture enables a modular and dynamic development of new functionalities: new applications can be easily added to, or executed from, the Applications folder using the low-level functionalities of the Engine folder.
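A minimal sketch of how the run program could discover and launch the applications stored in the Applications folder is given below; the package and function names are assumptions, not the actual SOFIA code:

    # Discover the available applications (one module per KP) and launch one.
    import importlib
    import pkgutil

    import Applications   # hypothetical package: one module per application (e.g. remote_control)

    def discover_applications():
        """Return {name: module} for every module found in the Applications package."""
        return {info.name: importlib.import_module(f"Applications.{info.name}")
                for info in pkgutil.iter_modules(Applications.__path__)}

    apps = discover_applications()
    apps["remote_control"].main()   # selected from the menu; main() is an assumed entry point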

Fig. 3 Representation of the interaction model of the device: a the user reads the RFID tag of a device and exchanges its information with the SIB; b the device can be controlled via the SIB

Fig. 4 Hardware blocks of the smart object

4.2 Gestural layer: algorithm

Gestures executed with natural hand and arm movements are variable in their spatial and temporal execution, requiring classifiers suited to temporal pattern recognition. Typical approaches include Dynamic Time Warping (DTW) [29], Neural Networks [30] and Hidden Markov Models (HMMs) [31–33]. HMMs are often used in activity recognition since they tend to perform well with a wide range of sensor modalities and with temporal variations in gesture duration. Several variants of HMMs have been proposed to recognize inertial gestures: in [31], 5-state ergodic discrete HMMs are evaluated with the Viterbi algorithm to classify gestures performed with a handheld sensor. The work of Mantyla et al. [32] uses 7-state left-to-right models and the forward algorithm to classify actions performed with a mobile phone equipped with an accelerometer. Both implementations have similar performance and rely on a PC to execute all computations. In our work, we use low-power hardware without a floating-point unit; we therefore implemented a fixed-point variant of the forward algorithm, presented in a previous work [33].

Using HMMs to classify gestures from a continuous stream of data brings another issue to solve: the recognition procedure needs to discriminate actually executed gestures from all the other arbitrary movements. Hofmann et al. [34] use a sensorized glove to recognize hand gestures: to segment the data stream, they compute the velocity profile of the sampled accelerations and apply a threshold to identify the motion segments. In [35], a Gaussian model of the stationary state is used with a sliding-window approach to find pauses in movements, which identify the beginning and the end of a gesture. While these works focused on developing segmentation and recognition solutions, none of them deals with computation- or memory-limited devices. We found a similar solution implemented on a wristwatch device, using a 32-bit ARM microcontroller [36], but there are no works targeting low-cost, low-power 8-bit microcontrollers, such as the Atmel ATmega168 used in this work (Fig. 5).

In our case, gestures can occur only at specified moments (i.e. after reading an RFID tag) and the device signals to the user when to execute a gesture. All gestures begin and end with the user holding the device still in the same position, and this condition is used to finely segment only the desired gesture. The recognition algorithm transforms raw movements into interaction primitives and can trigger higher-level events. To improve usability and recognition performance, we chose a limited set of gestures, such as changes in orientation, continuous rotations of the device around a horizontal axis, or simple directional movements, as illustrated in Fig. 6.
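Since every gesture starts and ends with the device held still, a simple stillness test over a sliding window is enough to cut out the motion segment. The sketch below is a floating-point illustration of this idea; the window length and threshold are arbitrary values, not the tuned firmware parameters:

    # Segment a gesture out of the acceleration stream using a variance threshold.
    import numpy as np

    def segment_gesture(acc: np.ndarray, win: int = 10, thr: float = 0.05):
        """acc: (N, 3) acceleration samples; returns (start, end) sample indices or None."""
        energy = np.array([acc[i:i + win].var(axis=0).sum()
                           for i in range(len(acc) - win)])
        moving = np.flatnonzero(energy > thr)
        if moving.size == 0:
            return None                      # the user never moved: nothing to classify
        return moving[0], moving[-1] + win   # still -> moving -> still again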

The main feature used for gesture recognition is the direction of the movement, represented by the direction of the acceleration vector, which is sampled at each frame. This information is obtained by converting the 3D acceleration vector (a_x, a_y, a_z) into spherical coordinates and using only the two angles (φ, θ). The two computed angles, which identify the arbitrary 3D orientation of a unit vector, are quantized to the nearest vector of a 26-entry uniform codebook by a minimum-distance classifier. To efficiently compute the two angles of the acceleration vector, we used an implementation of the CORDIC algorithm. Using the notation in Fig. 7, this algorithm first estimates the phase φ and the magnitude r′ of the complex number (a_x + i·a_y), and then estimates the angle θ using r′ and a_z. All computations are done with integer values, giving a resolution of 1° and a maximum error of 2°, which is acceptable since we are dealing with human motion and do not need higher accuracy.
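The fragment below reproduces the same feature extraction in floating point for clarity (the firmware uses an integer CORDIC instead of atan2). The particular 26-entry codebook shown, the unit vectors toward the faces, edges and corners of a cube, is one plausible uniform layout and is an assumption:

    import itertools
    import math

    # 26 quantization directions: all non-zero {-1, 0, 1}^3 combinations, normalized.
    CODEBOOK = []
    for v in itertools.product((-1, 0, 1), repeat=3):
        if any(v):
            n = math.sqrt(sum(c * c for c in v))
            CODEBOOK.append(tuple(c / n for c in v))

    def to_angles(ax, ay, az):
        """Direction of the acceleration vector as the two angles (phi, theta)."""
        phi = math.atan2(ay, ax)        # phase of the complex number ax + i*ay
        r1 = math.hypot(ax, ay)         # its magnitude r'
        theta = math.atan2(az, r1)      # elevation computed from r' and az
        return phi, theta

    def quantize(ax, ay, az):
        """Minimum-distance classification of the direction against the codebook."""
        n = math.sqrt(ax * ax + ay * ay + az * az) or 1.0
        u = (ax / n, ay / n, az / n)
        return max(range(len(CODEBOOK)),
                   key=lambda i: sum(a * b for a, b in zip(u, CODEBOOK[i])))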

The off-line HMM training phase builds a model for each of the gestures to recognize, using sample instances of the gestures. We used the Baum–Welch algorithm and initialized the training with several random probability distributions. Among the resulting HMMs, those with the lowest training error were chosen. The on-line recognition algorithm evaluates the executed gesture with all of the trained models and selects the model with the highest probability. For this purpose, we used a fixed-point version of the forward algorithm, as implemented in [37]. This implementation deals with the lack of a division unit in the low-power microcontroller embedded in the device and proposes a different scaling procedure that uses shifts and a logarithmic representation of the probabilities. Our previous work compared the performance of this implementation against a standard floating-point algorithm. The results showed that a 16-bit fixed-point algorithm has the best trade-off between classification rate and computational complexity.

Fig. 5 The REGALS prototype

Fig. 6 Gestures recognized by the algorithm
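For reference, a floating-point, log-domain analogue of the classification step is sketched below (the on-board version replaces this with the fixed-point scaling described above). A and B are the state transition and emission matrices, pi the initial distribution, and obs the sequence of codebook symbols:

    import numpy as np

    def log_forward(pi, A, B, obs):
        """Return log P(obs | model) for a discrete-output HMM (forward algorithm)."""
        logA, logB = np.log(A), np.log(B)
        alpha = np.log(pi) + logB[:, obs[0]]
        for o in obs[1:]:
            # log-sum-exp over the previous states for every next state
            alpha = logB[:, o] + np.logaddexp.reduce(alpha[:, None] + logA, axis=0)
        return np.logaddexp.reduce(alpha)

    def classify(models, obs):
        """models: {gesture_name: (pi, A, B)}; pick the most likely gesture."""
        return max(models, key=lambda g: log_forward(*models[g], obs))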

4.3 Mapping on intentional layer: application cases

In this section, we show how the Semantic Interaction Ontology and the higher-level interaction layer introduced in Sect. 3 can be implemented on a smart object such as the REGALS. The Smart Space may be used to enhance and control its functionality, selecting different applications or changing the semantic transformations between the four layers.

The modular software architecture adopted enables a dynamic selection among all the available applications, changing the overall functionality of the device. Having different applications for different tasks is a standard way to add flexibility to a device, but thanks to the support of a Smart Space, this device can be reconfigured even without changing the application. Using the Semantic Interaction Ontology and the abstraction layers between user actions and desired events in the digital world, the transformations between gestures and events can be chosen each time according to the needs of the application or the user, making the device fully reconfigurable. This selection is performed by the SIB and can be implemented depending on the RFID tag of the selected device or using context information from the Smart Space. Two different applications have been developed for the smart object so far: Smartification and Remote Control.

Smartification is the word we use for the process of creating the digital representation of some physical objects: the Smart Space is made aware of the relevant physical-world entities, their identification and the spatial relationships between them. In this scenario, all the physical objects involved are tagged with a unique RFID tag. To create a digital representation of one of those objects, we need to pair the object with a unique identifier. Figure 8 represents the ontology graph that is created during the Smartification process.

Figure 9 represents the sequence diagram of the interactions between the actors involved in Smartification: after selecting the application with the corresponding gesture, the user selects a desired object by reading its RFID tag. If it is not already in the SIB, its digital representation is created as an instance of the Object class (Object_UUID), coupled with an instance of the IdentificationData class, which holds the value of the RFID tag. To create positional information, the user can identify a room by reading the corresponding RFID tag and performing the addRoom gesture. This is equivalent to creating an instance of the Environment class (Room_UUID) and associating with it an instance of the IdentificationData class (Identification_UUID) which holds the value of the room's RFID tag. To locate the object in the room, after selecting the Smartification application, the user reads the room and the object identifiers; this association is created by means of the ContainsEntity property, which connects the room and the object.

Fig. 7 Representation of the acceleration vector and the discrete codebook

Fig. 8 Smartification ontology
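Expressed as triples, a Smartification step amounts to a handful of insertions. The sketch below follows the class and property names mentioned above (Object, IdentificationData, ContainsEntity), while the sib_insert helper, the hasIdentificationData/dataValue property names and the URI scheme are illustrative assumptions:

    # Triples a Smartification step could insert into the SIB.
    import uuid

    def smartify_object(sib_insert, rfid_tag: str) -> str:
        obj = f"Object_{uuid.uuid4().hex[:8]}"
        ident = f"Identification_{uuid.uuid4().hex[:8]}"
        sib_insert([(obj,   "hasIdentificationData", ident),
                    (ident, "dataValue",             rfid_tag)])
        return obj

    def locate_in_room(sib_insert, room_uri: str, obj_uri: str) -> None:
        # The room instance (Room_UUID) is created analogously via the addRoom gesture.
        sib_insert([(room_uri, "ContainsEntity", obj_uri)])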

In the Remote Control application, REGALS is used to control a digital device which has an interface to the SIB and is identified by an RFID tag. In this scenario, the user selects the device by reading its tag and then performs the gesture corresponding to the command he wants to execute (e.g. the user selects a Media Player device and performs a gesture to start playing some music). Figure 10 represents the flow diagram of the operations executed during this application. After the selection of the application, the user reads the tag of the desired device, so that the SIB associates the REGALS with the device to control using the connectedTo property. The user can now control the device using the appropriate gestures. Since the SIB handles all the communications and the semantic transformations, the REGALS is compatible with every device that has an interface toward the SIB, implemented through an appropriate KP.

The Remote Control application is also an example of the reconfigurable interaction properties of this device when interfaced with the Smart Space. According to the Semantic Interaction Ontology, the smart object has a set of Interaction Primitives and every controlled object has a set of actions. Depending on the controlled device selected and the context information stored in the SIB, the Gesture–Interaction Primitive–Action mapping is reconfigured. The dynamic reconfiguration of the REGALS is composed of the following steps:

1. Smart object initialization. When the smart object joins a new Smart Space, it uploads its digital representation to the SIB. In this representation, the most relevant information is the Device Identifier and the available Interaction Primitives.

2. Connection between the smart object and the controlled device (e.g. a Media Player). The user selects the device to control by reading its RFID tag. The digital representation of the controlled device is assumed to be already in the SIB.

3. Smart object reconfiguration. The Interaction Primitives are mapped to the actions supported by the controlled device. This mapping can be influenced by different factors, such as user preferences, controlled device status or other context information available from the smart environment.

The steps described above can be recognized in Fig. 11, where a simplified example of an ontology graph reproducing the remote control application is shown. In the graph, the device, identified as SmartObject_UUID, has joined the Smart Space and has been connected to the controlled device MediaPlayer_UUID after reading its tag. Therefore, the SIB has created and added the interaction ontology sub-graph regarding the mapping between the available interaction primitives and services, shaded in the figure. In this example, the smart object has one Interaction Primitive, the gesture MoveDown, and it has been associated with the event PlayEvent, using the semantic transformation canBeTransformedTo. Now, when the user performs the gesture MoveDown, the SIB triggers the PlayEvent and the player executes the corresponding action, playing some media.
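The shaded part of the graph in Fig. 11 boils down to a few triples plus the event instance created at gesture time. The sketch below uses the names given in the text (connectedTo, canBeTransformedTo, PlayEvent), while the sib_insert helper and the hasEvent/eventType property names are assumptions:

    # Triples written when the smart object is connected and then used.
    SETUP = [
        ("SmartObject_01", "connectedTo",        "MediaPlayer_01"),
        ("MoveDown",       "canBeTransformedTo", "PlayEvent"),
    ]

    MAPPING = {"MoveDown": "PlayEvent"}          # derived from the SETUP sub-graph

    def on_gesture(sib_insert, gesture: str) -> None:
        """Create an Event instance and type it according to the semantic mapping."""
        event = "Event_01"
        sib_insert([("SmartObject_01", "hasEvent",  event),
                    (event,            "eventType", MAPPING[gesture])])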

5 Experimental tests

After the construction of the prototype, we tested its main functionalities. First, we validated the gesture recognition algorithm, to ensure the correct behavior of this input method. Second, we set up a SIB and used the remote-control application with example devices. We simulated some context and user information to verify the correct integration between the device and the Smart Space.

Fig. 9 Activity diagram of the smartification application

Fig. 10 Action diagram for the remote control application

5.1 Gesture recognition tests and results

For the validation of our algorithm, we used a set of 7 gestures, illustrated in Fig. 6. All gestures are formed by natural movements; they start and end with the user holding the cube in the same position and are executed on the vertical plane in front of the user, holding the device every time with the same orientation.

We collected gestures executed by four people, all male students with an average age of 26 years. To build and validate the HMMs, each user executed 80 instances of every gesture, over different days. The gestures were executed continuously, with a few seconds of interval between two consecutive instances. Gestures performed by three users were used for both modeling and validation, while gestures from the fourth user were used only to test the models. The whole dataset was collected with the REGALS prototype and stored on a PC. No feedback from the device or the PC was given to the users during the execution of the gestures. To easily test the performance, the algorithm was implemented in Matlab, taking care to simulate the computational constraints of the 8-bit microcontroller and using only integer computations with controlled variable sizes.

First, we used the collected dataset to train a set of HMMs for each user, applying floating-point notation with double precision. Each model was trained using 15 reference instances, 15 iterations of the Baum–Welch training algorithm and 10 initial random models. The floating-point models were then converted to fixed point, represented only by 16-bit integers. Each user's models were validated with his/her own gestures not used in the training phase, and with gestures from the other users.

The results show that the classifier performs well in a single-user scenario, with recognition rates up to 99.7%. The algorithm has some limitations in a multi-user scenario, particularly when recognizing gestures from a user with models trained by another user. We found that interpolating the trained models with a uniform one gave some advantages; Table 1 shows the classification rates for the various users in this case, and Table 2 shows the classification matrix in the best case.

To overcome the limitations of the multi-user scenario, we put together all the gesture instances, regardless of the user who executed them, and built a global model for each gesture. These models were trained using 15 randomly chosen gestures and validated on 200 gestures from the four users. The results for this case are presented in Table 3, which shows a performance comparable to the single-user scenario, achieving a classification rate of 94.6% even though we are classifying gestures from all the users. We can also notice that the fixed-point implementation has performance comparable to the floating-point one, so our algorithm is suitable for low-performance smart objects.

Fig. 11 Remote control ontology
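As a quick consistency check, the 94.6% figure can be recomputed from the confusion matrix reported in Table 3 (rows: executed gesture, columns: classification output):

    confusion = [
        [194, 0,   3,   0,   2,   1,   0],    # Up
        [0,   187, 1,   11,  0,   1,   0],    # Right
        [1,   0,   199, 0,   0,   0,   0],    # Down
        [0,   1,   0,   199, 0,   0,   0],    # Left
        [0,   0,   0,   4,   177, 19,  0],    # Circle
        [0,   0,   0,   0,   13,  187, 0],    # Square
        [3,   0,   12,  0,   1,   2,   182],  # X
    ]
    correct = sum(row[i] for i, row in enumerate(confusion))
    total = sum(map(sum, confusion))
    print(f"{correct}/{total} = {correct / total:.1%}")   # 1325/1400 = 94.6%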

5.2 Practical application: remote control

As seen, a Smart Space plus the use of the Semantic Interaction Ontology turns the REGALS from a normal interaction device into a reconfigurable smart object, potentially enhanced with context information and the user profile and preferences. These features have been demonstrated by means of two applications, Smartification and Remote Control, and here we present the second one to illustrate the functionalities of the device.

The simple application scenario used to evaluate the system is structured as follows: the SIB runs on a PC and manages all the connections between the devices; the same PC runs a KP to enable the control of some LEDs, connected through a serial adapter; and a digital frame with an Ethernet connection runs a KP, becoming another controlled device. The LEDs, simulating a generic lighting device, are an example of a legacy device interfaced to the SIB through an adapter KP, while the digital frame, which has been reprogrammed to run a simple media player and the corresponding KP, is an example of a Smart Space-enabled device. Both devices have an associated RFID tag identifying them, and some actions that can be performed by the user (e.g. LED on/off, or play, stop, next, previous for the digital frame). To test the reconfigurable properties of the REGALS, we created two user profiles, each having different gesture mappings to the various available actions.

Each user performed a simple set of actions, consisting of turning the LEDs on and off and starting and stopping some media on the digital frame (e.g. videos, picture slideshows). When a user selects one of the two devices, the corresponding gesture mapping is activated, so that he/she can control it in the desired way. When the same user selects the other device, the gesture mapping is automatically updated by the SIB to control the new device; thus the interaction properties change not only according to user preferences, but also depending on the device involved. The rest of this section illustrates in more detail the actions and events performed during the interaction with one device, presenting the example of the media player control (Fig. 12).

The digital frame runs a media player application, has an Ethernet interface through which it communicates with the SIB, and runs a consumer KP which is subscribed to the InteractionPrimitive events and performs the actual MediaPlayerEvents.

In Fig. 13, a screenshot of the SIB explorer is shown at the end of a PlayEvent action, according to the Remote Control ontology graph presented above. We highlight the most important passages:

1. Smart object initialization:
   (a) The REGALS inserts its instance (SmartObject_01)
   (b) The REGALS inserts the InteractionPrimitive sub-graph, in this case: MoveDown, MoveUp, MoveLeft, MoveRight

2. Media player initialization:
   (a) A MediaPlayer instance MediaPlayer_01 is inserted, with its identification (Identification_01)
   (b) The media player inserts the available events

3. The smart object connects to the media player using the connectedTo property: ⟨SmartObject_01, connectedTo, MediaPlayer_01⟩

Table 1 Classification rates in the multi-user scenario

  Training set   Validation set
                 User 1   User 2   User 3
  User 1         0.99     0.66     0.85
  User 2         0.94     0.92     0.85
  User 3         0.93     0.71     0.99

Table 2 Classification rates in the best case

  Gesture   Classified as
            Up   Right   Down   Left   Circle   Square   X
  Up        62   0       0      0      0        3        0
  Right     0    65      0      0      0        0        0
  Down      0    0       65     0      0        0        0
  Left      0    0       0      65     0        0        0
  Circle    0    0       0      0      65       0        0
  Square    0    0       0      0      0        65       0
  X         0    0       0      0      0        0        65

Table 3 Classification rates for the global model

  Gesture   Classified as
            Up    Right   Down   Left   Circle   Square   X
  Up        194   0       3      0      2        1        0
  Right     0     187     1      11     0        1        0
  Down      1     0       199    0      0        0        0
  Left      0     1       0      199    0        0        0
  Circle    0     0       0      4      177      19       0
  Square    0     0       0      0      13       187      0
  X         3     0       12     0      1        2        182

4. The InteractionPrimitives are mapped to the events using the canBeTransformedTo property; in this case they are PauseEvent, PlayEvent, NextTrackEvent, PreviousTrackEvent

5. Smart object in action:
   (a) The smart object is subscribed to the dataValue of its InteractionPrimitive.
   (b) When the user executes one of the gestures, the smart object creates and launches an Event, in this case Event_01.
   (c) The smart object assigns to the Event an EventType according to the selected gesture and the semantic mapping, which was configured by the SIB at the time of the connection.

In this example, we have recognized the MoveDown gesture, which is transformed into a PlayEvent and corresponds to the Play action on the media player, which starts displaying the desired media content.

6 Conclusion

In this paper, we presented an example of how next-generation smart environments, based on semantic technologies and context management systems, are suitable for multimodal and natural TUIs. Starting from a well-known model proposed for ontology-based interaction, it was possible to devise a set of interaction mechanisms and information flows within the smart environment developed on Smart M3, which made all the different kinds of actuators available to be operated in a natural and intuitive way. Furthermore, we presented REGALS, a prototype physical device, and put these principles into practice by using low-cost programmable components and smart on-board algorithms to support natural and tangible interaction paradigms, specifically enabling the use of gestures.

Fig. 12 The REGALS and the digital frame

Fig. 13 Screenshot of the SIB explorer, showing a running instance of the SIB, with the operating steps described in the text highlighted

The demonstrative features implemented were chosen to highlight the generality of the interaction model applied to REGALS and of the support provided by the Smart Space: Smartification works only at the information level, to provide the other Smart Space agents with more complete knowledge, while the controller functionality uses the available context (e.g. controlled entity, user profile) to properly adapt its behavior to it. The analysis of the gesture recognition algorithm showed that the performance is adequate for a smooth interaction, while, at the same time, the implementation of advanced algorithms on the object itself opens the possibility to appropriately manage an increased number of devices and their power resources, prolonging device lifetime by minimizing wireless transmission. This specific aspect, which has a direct impact on device usability and comfort, will be addressed in future work, along with a more exhaustive study of the usability of the device when integrated in a Smart Space.

Acknowledgments The authors would like to thank Luca Faggianelli for his work and assistance on the realization of the REGALS prototype. This work was carried out within the framework of a project of the European Joint Undertaking on Embedded Systems ARTEMIS. The project is called SOFIA (2009–2011); it is coordinated by NOKIA and co-funded by the EU and by national authorities including MIUR, the Italian central authority for education and research.

References

1. Weiser M (1999) The computer for the 21st century. SIGMOBILE Mob Comput Commun Rev 3(3):3–11
2. Monekosso DN, Remagnino P, Kuno Y (2009) Intelligent environments: methods, algorithms and applications. In: Monekosso D, Kuno Y (eds) Advanced information and knowledge processing, 1st edn. Springer, Berlin, p 211. http://www.springer.com/computer/ai/book/978-1-84800-345-3
3. Lopez T, Ranasinghe D, Patkai B, McFarlane D (2009) Taxonomy, technology and applications of smart objects. Inform Syst Front 1387(3326):1–20
4. Bartolini S, Roffia L, Salmon Cinotti T, Manzaroli D, Spadini F, D'Elia A, Vergari F, Zamagni G, Di Stefano L, Franchi A, Farella E, Zappi P, Costanzo A, Montanari E (2010) Creazione automatica di ambienti intelligenti [Automatic creation of intelligent environments]. Patent pending, March 2010, BO201A000117
5. Schmidt A (2000) Implicit human computer interaction through context. Pers Ubiquit Comput 4(2):191–199
6. Ryan N (2005) Smart environments for cultural heritage. In: Takao UNO (ed) Reading historical spatial information from around the world: studies of culture and civilization based on geographic information systems data
7. Salmeri A, Licciardi CA, Lamorte L, Valla M, Giannantonio R, Sgroi M (2009) An architecture to combine context awareness and body sensor networks for health care applications. In: International conference on smart homes and health telematics. Springer, Berlin, pp 90–97
8. Dey A, Salber D, Abowd G (2001) A conceptual framework and a toolkit for supporting the rapid prototyping of context-aware applications. Hum Comput Interact 16(2–4):97–166
9. Honkola J, Laine H, Brown R, Oliver I (2009) Cross-domain interoperability: a case study. In: International conference on smart spaces and next generation wired/wireless networking and 2nd conference on smart spaces (NEW2AN'09 and ruSMART'09), pp 22–31
10. Schilit WN (1995) A system architecture for context-aware mobile computing. PhD thesis, Columbia University
11. Overview of SGML resources: http://www.w3.org/MarkUp/SGML
12. Ranganathan A, Campbell RH (2003) A middleware for context-aware agents in ubiquitous computing environments. In: Endler M, Schmidt DC (eds) Middleware, vol 2672 of Lecture Notes in Computer Science, pp 143–161
13. Gruber TR (1993) A translation approach to portable ontology specifications. Knowl Acquisition 5(2):199–220
14. Smart-M3 public source code: http://sourceforge.net/projects/smart-m3/
15. Mäntyjärvi J, Paternò F, Salvador Z, Santoro C (2006) Scan and tilt: towards natural interaction for mobile museum guides. In: Conference on human-computer interaction with mobile devices and services, pp 191–194
16. Pering T, Ballagas R, Want R (2005) Spontaneous marriages of mobile devices and interactive spaces. Commun ACM 48(9):53–59
17. Kranz M, Holleis P, Schmidt A (2010) Embedded interaction: interacting with the internet of things. IEEE Internet Comput 14(2):46–53
18. Pan G, Wu J, Zhang D, Wu Z, Yang Y, Li S (2010) GeeAir: a universal multimodal remote control device for home appliances. Pers Ubiquit Comput 14(8):723–735
19. Niezen G, Van der Vlist B, Hu J, Feijs L (2010) From events to goals: supporting semantic interaction in smart environments. In: The IEEE symposium on computers and communications, pp 1029–1034
20. Franchi A, Di Stefano L, Cinotti TS (2010) Mobile visual search using Smart-M3. In: The IEEE symposium on computers and communications, pp 1065–1070
21. Van der Vlist B, Niezen G, Hu J, Feijs L (2010) Semantic connections: exploring and manipulating connections in smart spaces. In: The IEEE symposium on computers and communications, pp 1–4
22. Vergari F, Bartolini S, Spadini F, D'Elia A, Zamagni G, Roffia L, Cinotti TS (2010) A smart space application to dynamically relate medical and environmental information. In: Design, Automation & Test in Europe (DATE'10), pp 1542–1547
23. http://www.w3.org/RDF
24. http://www.w3.org/TR/owl-ref
25. Horrocks I, Kutz O, Sattler U (2006) The even more irresistible SROIQ. In: International conference on knowledge representation and reasoning, pp 57–67
26. http://www.w3.org/2010/webevents/charter
27. http://www.gumstix.com/store/catalog/product_info.php?products_id=210
28. http://www.tagsense.com/index.php?option=com_content_&view=article&id=142:nano-uhf&catid=49:uhf-readers&Itemid=117
29. Kim L, Cho H, Park SH, Han M (2007) A tangible user interface with multimodal feedback. In: International conference on human-computer interaction, pp 94–103
30. Bailador G, Roggen D, Tröster G, Trivino G (2007) Real time gesture recognition using continuous time recurrent neural networks. In: International conference on body area networks (BodyNets), article no. 15
31. Kela J, Korpipää P, Mäntyjärvi J, Kallio S, Savino G, Jozzo L, Di Marca S (2006) Accelerometer-based gesture control for a design environment. Pers Ubiquit Comput 10(5):285–299
32. Mäntylä V-M, Mäntyjärvi J, Seppänen T, Tuulari E (2000) Hand gesture recognition of a mobile device user. IEEE Int Conf MultiMed Expo 1(c):281–284
33. Zappi P, Milosevic B, Farella E, Benini L (2009) Hidden Markov model based gesture recognition on low-cost, low-power tangible user interfaces. Entertain Comput 1(2):75–84
34. Hofmann FG, Heyer P, Hommel G (1997) Velocity profile based recognition of dynamic gestures with discrete Hidden Markov models. In: Gesture and sign language in human-computer interaction, international gesture workshop. Springer, Berlin, pp 81–95. http://www.springerlink.com/content/wju4v16208336502/about/
35. Chambers GS, Venkatesh S, West GA, Bui HH (2004) Segmentation of intentional human gestures for sports video annotation. In: International multimedia modelling conference, pp 124–130
36. Amstutz R, Amft O, French B, Smailagic A, Siewiorek D, Tröster G (2009) Performance analysis of an HMM-based gesture recognition using a wristwatch device. Int Conf Comput Sci Eng 02:303–309
37. Milosevic B, Farella E, Benini L (2010) Continuous gesture recognition for resource constrained smart objects. In: Proceedings of the fourth international conference on mobile ubiquitous computing, systems, services and technologies, UBICOMM 2010, pp 391–396
