ORIGINAL ARTICLE
Reconfigurable natural interaction in smart environments: approach and prototype implementation
Sara Bartolini • Bojan Milosevic • Alfredo D’Elia •
Elisabetta Farella • Luca Benini • Tullio Salmon Cinotti
Received: 7 March 2011 / Accepted: 19 August 2011 / Published online: 13 September 2011
© Springer-Verlag London Limited 2011

S. Bartolini · A. D'Elia · T. S. Cinotti
ARCES—Università di Bologna, Via Toffano 2, 40125 Bologna, Italy
e-mail: [email protected]

B. Milosevic · E. Farella (✉) · L. Benini
DEIS—Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy
e-mail: [email protected]

Pers Ubiquit Comput (2012) 16:943–956
DOI 10.1007/s00779-011-0454-5
Abstract The vision of sensor-driven applications that
adapt to the environment holds great promise, but it is difficult to turn these applications into reality because device
and space heterogeneity is an obstacle to interoperability
and mutual understanding of the smart devices and spaces
involved. Smart Spaces provide shared knowledge about
physical domains and they inherently enable cooperative
and adaptable applications by keeping track of the semantic
relations between objects in the environment. In this paper,
the interplay between sensor-driven objects and Smart
Spaces is investigated and a device with a tangible interface
demonstrates the potential of the smart-space-based and
sensor-driven computing paradigm. The proposed device
is named REGALS (Reconfigurable Gesture based Actua-
tor and Low Range Smartifier). We show how, starting from
an interaction model proposed by Niezen, REGALS can
reconfigure itself to support different functions like Smart
Space creation (also called environment smartification),
interaction with heterogeneous devices and handling of
semantic connections between gestures, actions, devices,
and objects. This reconfiguration ability is based on the
context received from the Smart Space. The paper also
shows how tagged objects and natural gestures are recognized to improve the user experience, and it reports a use case and the performance evaluation of the REGALS gesture classifier.
Keywords Ambient intelligence · Smart Space · Tangible user interface · Gesture recognition
1 Introduction
The quality of our life is profoundly affected by the nature
of our interactions, not only with other human beings, but
also with objects and environments with which we have to
deal every day. If we focus on some specific categories of
people, this becomes evident: elderly and disabled people
are the most aware of how important a smooth and easy interaction can be with home appliances and other typical physical affordances (e.g. doors, windows) or infrastructures (e.g. lights, HVAC systems) that they encounter in their routine.
For such interactions to be really satisfactory and
comfortable, both environments and objects should be
smart, so that they can take an active role in the interaction
with humans, capable of understanding situations and reacting in the fastest and most appropriate way. To accomplish that, they must be versatile enough to act differently
according to the context. Furthermore, all interactions
should minimize the need for explicit commands and should
be based on the natural expressions of inter-human
communication.
If we look back to 1991, the idea of people inter-
acting with a Smart Environment and with objects in a
seamless way was considered a vision coming from the
intuition of the upcoming fast progress in communication,
processing, and sensing technologies [1]. To capture this ensemble of concepts, the term Ambient Intelligence was coined, as summarized in [2].
Today, we witness the proliferation of conceptual models, real implementations, single applications, etc., all
contributing to turning Monekosso's vision into fact.
There are many practical experiences where the space is made able to understand what is going on inside it by means of suitable sensors that communicate wirelessly with each other to form wireless sensor networks (WSNs), improving flexibility and installation cost. Likewise, objects
can be made to think by means of embedded microcon-
trollers or smart tags (based on different technologies, such
as RFID, fiducial markers, bar codes [3]), associated with
software applications that link the object to services,
behaviors, and actions. All the interactions between
humans, devices and the smart environment can take place
via natural interfaces, exploiting voice, gestures and
expressions as the humans normally do when interacting
with each other.
However, many practical implementations lack abstraction, interoperability or scalability and require considerable work to be reapplied in a different environment. In particular, this limitation applies to interaction devices and smart objects that are not conceived to be universal and programmable and are not able to behave differently according to which appliance they are interacting with.
A step further toward the applicability of Ambient
Intelligence (AmI) in everyday life is therefore the avail-
ability of a Smart Space intended as a digital entity, a
Context Management System, where relevant real-world
information is stored and kept up to date. The Smart Space
can support upgrading, flexibility, interoperability and
modularization. New entities are added as soon as they
emerge (e.g. physical objects, digital objects, devices,
sensors, smart rooms). But more than this, the Smart Space
enables flexible handling of semantic connections and
interactions between these entities. As a consequence, an
object can have a different behavior or function depending
on the overall context of the environment, the profile of the
user that interacts with it or her/his intentions, the entity
it is interacting with, etc.
In this work, we therefore present a combination of
middleware and smart objects that enable Ambient Intel-
ligence applications supporting gesture-based interaction.
We address the need to provide a general model and some
practical and real implementations to interact with the
Smart Space, targeting not only interaction with appliances
and technology, but also the automatic creation of the
Smart Space itself. In fact, we describe a new solution to automatically create the Smart Space (i.e. the digital representation of real environments) starting from a physical space (a city, a private or public space, a vehicle) containing real objects which need to be identified, located or monitored (common objects, devices, furniture); we call this the ''Smartification'' process [4].
Having created a Smart Environment in a fast and easy way, it becomes possible to take advantage of the benefits of the Smart Space, carrying out the following information-level operations:
– Association of selected properties with the state of the
objects (e.g. location, type, description)
– Object state monitoring
– Object state update and notification
The Smartification can have many different implementa-
tions, and SOFIA,1 the project supporting this research, is
addressing some of them. To prove the concept, we
introduced an example of a tangible interface that is able to
capture a number of gestures from the user, to perform
some Smartification tasks or to control other devices. This
smart object, called REGALS (Reconfigurable Gesture
based Actuator and Low Range Smartifier), is a versatile
interaction tool and can be used in different applications,
thanks to the ability to reconfigure its functionality
according to the user preferences or the context inherited
from the Smart Space.
In Sect. 2, we provide background and related work
information on tangible interfaces and Smart Spaces. In
Sect. 3, we introduce the key concepts and the software
architecture of the SOFIA Smart Environment, with particular attention to the ontological model we used to represent the gestural interaction. In Sect. 4, we introduce
the REGALS, i.e., the prototype we have realized to
demonstrate our approach, describing its hardware archi-
tecture, the algorithms for gesture recognition and its
behavior during usage in SOFIA use cases. Before drawing
conclusions in Sect. 6, in Sect. 5, we show the results of
the performance evaluation for the gesture recognition
algorithms and we describe in detail an implemented use
case for controlling a media player in a smart environment.
2 Background
In the proposed vision, smart environments have an asso-
ciated digital representation shared by multiple devices.
Smart environments support dynamic and context-aware
applications where artificial entities may collaborate and
adapt to the user needs with minimal visibility or impact on
user life. This scenario was first envisaged by Weiser [1]
who, analyzing the state of the art and the current trend of
technology advances, considered possible the integration of
multiple low-cost collaborating computers in a single
localized network. Nowadays, processors embedded in
everyday life objects are able to seamlessly connect users
with a wider information world, where user actions, as
intuitive and simple as possible, or their implicit behavior, are reflected in changes in the information world itself [5].
1 This research is being developed within the framework of the SOFIA project of the European Joint Undertaking on Embedded Systems ARTEMIS: http://www.sofia-project.eu.
Requirements to implement this scenario are:
– a Context Management System (CMS) accessible by all
the relevant interacting entities (i.e. smart objects,
devices and spaces), providing their context and being
able to drive their behavior in the most appropriate
way;
– a concrete interaction model hiding the complexity of
the system and its subsystems, supporting (or being
general enough to support) facilities for configuration,
personalization, extensibility and upgrades.
In recent years, several research studies focused on CMS
by proposing new solutions for specific domains or
classifying them according to different aspects, from
system architectures to information storing techniques.
CMSs have been proposed in the field of cultural heritage [6]
and for supporting mobile applications [7]. There are also
frameworks helping developers to create context-aware
applications [8] and architectures usable in broad domains
[9].
Among the technical features which distinguish the different CMSs at the basis of a smart environment implementation, one of the most relevant for the present discussion is information representation and modeling. Information
is usually collected by devices provided with sensory
equipment and sent through some protocol to the storage system which, in turn, allows the software agents to access the contents. Some of the most common information representation techniques are key-value pairs (e.g. used in [10]), markup schemes derived from SGML [11], logic based on expressions, assertions and rules [12], and
ontologies [13].
The work described herein is based on Smart M3 [14]:
the interoperability platform that is under development in
the SOFIA European project. A Smart M3-based smart
environment is domain agnostic, and since its information
is related to an ontology, applications and devices may
use a common vocabulary to enhance the interaction with
the user.
A smart environment has to be manipulable by humans to be accessible. For this reason, the interaction models between the smart environment and the user are one of the most important fields of research in the context of smart environments and beyond. To improve the quality
of the interaction with the surroundings and with the
objects, natural 3D interaction paradigms are needed to
deal with the intrinsic immersive nature of Ambient
Intelligence. These paradigms are taken into account in the
design and creation of Tangible User Interfaces (TUIs),
which introduce tangible devices or smart objects that
augment the real physical world by coupling digital
information to everyday objects. Smart objects are portable
devices with a vast range of functions: their common fea-
ture is on-board integration of sensors, actuators and some
processing and communication capability. Furthermore, the
opportunity to execute on-board complex processing tasks,
such as gesture recognition algorithms, without the need to
send data streams from the local sensor to a central base
station, results in extended battery life, improved system
scalability and easier handling of mobile TUIs [15].
An example of this from the literature is the Elope system
[16], which combines advanced mobile devices, interactive
spaces, and tagged objects to enable the complete config-
uration of an office-based Smart Space (based on a simple
user action), including launching the desired application
and loading a user’s personal data. Another example is the
work of Kranz et al. [17], which introduces netgets, spe-
cialized networked gadgets with sensors and actuators that
let users seamlessly manipulate digital information and
data in the context of real-world usage. GeeAir, a WiiMote-based handheld device for remotely controlling home appliances, is presented in [18]. It implements a
mixed modality interface, composed of speech, gesture
recognition, joystick and buttons, outlining the need for a
set of different paradigms and means to
interact with the environment. In this case, an IR-Bluetooth
adapter is used to communicate with legacy devices and the
multimodal capabilities improve the usability, especially
addressing elderly and disabled people as possible users.
However, these works share the limitation of interacting with a
''passive'' Smart Space, which is not pro-active in deter-
mining the configuration of the digital entities and in
changing their behavior depending on available informa-
tion or user preferences.
Important features that a smart environment should
provide to enhance interaction are the configurability and
the adaptability to different profiles. If an interacting
object is configurable, it is possible to set up its interface in
order to communicate with other devices belonging to the
smart environment or to perform different tasks. The
adaptability consists in the possibility of doing the same
actions with different modalities (e.g. adapting the inter-
face to a user profile).
The proper usage of domain ontologies in a context-
aware architecture allows many different functionalities
(configurability). This is obtained by relating devices'
internal events to different semantics according to the spe-
cific configuration. In Sect. 3, the interaction ontology will
be detailed, which implements concepts and relationships to
enable a controller to adapt to different usage profiles.
Some examples of this approach are currently being
studied in the SOFIA project. The work of [19] discusses
this fact, i.e. how to represent low-level interaction events
to provide easy controls of and in a smart environment. In
[20] and [21], these concepts are implemented: in the
former, a camera-equipped handheld device is used to
assign digital information to a physical object, identified by
its picture. In the latter, a specific interaction device is
proposed: the Interaction Tile allows users to explore and
create semantic connections between different devices in
the smart environment.
The work here described contributes to this research
scenario not by simply providing a device integrated in
existing software architecture, but demonstrating a new
approach to the interaction between a human and the sur-
rounding environment. By means of a semantic representation at information level of all the relevant concepts, the gestures performed are separated from the functions they activate, providing complete adaptability to tasks and preferences. The same object is able to identify RFID codes or to recognize gestures, and through its wireless connectivity it can upgrade a smart environment not only with a new functionality, but more generically with gesture-based natural interaction. Two different tasks have been implemented (namely the ''Smartification'' and ''Remote Control'' tasks) and will be described, but many others are possible. Con-
tributions from different research areas, such as ontology-
based interoperability in smart environments and natural
interaction (e.g. gesture recognition, tangible interfaces),
are combined to demonstrate, through the example of a
working prototype, the benefits of such merging.
3 Definitions and SOFIA ontology
In this section, we introduce the main entities in a Smart
M3-based [14] Smart Space where the REGALS will
perform its actions. Smart M3 is a CMS conceived by
Nokia to enable multi-vendor, multi-device and multi-part
applications through interoperability at different levels of
abstraction. Without going into detail, there are two layers
of interoperability relevant for the current implementation:
the communication-level interoperability allows the dis-
tributed processes to communicate with a central entity,
while interoperability at the information layer is a quality of systems in which the different interacting entities are able to understand each other without ambiguity. Figure 1
shows the Smart Space software architecture and various
ways of connecting existing components to the smart
platform. The aim of the Smart M3 interoperability plat-
form, in fact, is not only to give a set of rules that, if
respected, allow developers to create multiplatform appli-
cations, but also to give methodologies to connect existing
technologies to the Smart Space. The compatibility with
legacy devices enables the early creation of applications
and services and a wider range of target platforms.
An adapter knowledge processor (KP) is a software program which queries, subscribes to and inserts into the Smart Space all the information relevant to the legacy device considered. In (a), the adapter KP is directly embedded in the
legacy device turning it into a Smart Space-enabled device.
This clean solution is not always applicable, because the target device could be non-programmable or resource-constrained. In such cases, other solutions are possible, like implementing the adapter in the platform hosting the SIB, to which the legacy device is connected (b), or in a helper
device (c) as it naturally happens for wearable sensors [22].
Case (d) represents the situation of modular devices. With this kind of hardware, the KP runs in the
module controlling the wireless interface, while the other
modules are interfaced to the system through drivers
(D) that take into account the domain specification in
semantic format. This is the case of the REGALS that will
be detailed later.
Interoperability at communication level is obtained
through the Smart Space Access Protocol (SSAP) used by
all the devices to interact with the central context reposi-
tory, called Semantic Information Broker (SIB). The
communication between two devices is compliant with the
software architecture if mediated by the SIB, which also provides efficient support for a subscription-notification mechanism. The interoperability at information level is
provided by the adoption of machine readable technologies
for data representation. In particular, the RDF [23] triple ⟨subject, predicate, object⟩ is used as the elementary piece of information, in which the subject and the predicate are represented by unique identifiers (URIs), while the object may be a URI or a literal value.
OWL [24] ontologies specify from a higher perspective the
domain of interest in a way that, by referring to them,
software developers are able to create KPs consistent with
data semantics.
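To make these two levels concrete, the following is a minimal sketch of a KP's life cycle as seen from the device side. It assumes a hypothetical Python SSAP client: the SmartSpace class, its method names and the URIs are illustrative stand-ins, not the actual Smart M3 API.

```python
# Hypothetical SSAP client sketch: join a Smart Space, insert RDF triples,
# subscribe to a triple pattern. Names are stand-ins, not the Smart M3 API.
NS = "http://example.org/sofia#"  # invented namespace for the example

class SmartSpace:
    def __init__(self, host, port, name):
        self.host, self.port, self.name = host, port, name

    def join(self):
        print("SSAP join ->", self.name)        # session handshake with the SIB

    def insert(self, triples):
        for t in triples:                        # information-level unit: the triple
            print("SSAP insert ->", t)

    def subscribe(self, pattern, callback):
        print("SSAP subscribe ->", pattern)      # SIB notifies on matching changes
        self.callback = callback

def on_change(added, removed):
    for s, p, o in added:
        print("notified:", s, p, o)

ss = SmartSpace("sib.local", 10010, "demo-space")
ss.join()
ss.insert([(NS + "SmartObject_01", "rdf:type", NS + "SmartObject")])
ss.subscribe((None, NS + "hasEventType", None), on_change)
```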
Fig. 1 Representation of SOFIA architecture and knowledge
processors
Ontologies are knowledge representation models, based
on labelled directed graphs, aimed at describing a domain
of interest in a machine readable format. Ontologies can be
used for many purposes, such as defining a common language on which multi-vendor devices may base their communication; it is also possible to reason over context, since an ontology can be associated with a description logic [25], or to run consistency checks to verify that the set of triples asserted in the Smart Space corresponds to a valid configuration.
A good ontological solution describes the relevant
entities, their properties and their relationships, to a level
of detail dependent on the application needs. In our
ontology development, we worked in a collaborative sce-
nario where different working groups worked on different
aspects of the target smart environment. The natural
property of ontologies of being extensible and well understandable, thanks to the many software instruments avail-
able for editing and visualization, allows this kind of
modular and collaborative development.
The interaction ontology or, better, the semantic infrastructure, is used to allow interaction with a generic appliance through the insertion of the recognized RFID codes and gestures into the device's state. The ontological model to
control an appliance in a SOFIA smart environment has
been realized by Niezen et al. [19], as a derivation from the
considerations made by the recently established W3C Web
Events Working Group [26]. The W3C Events Working
group has defined a layered conceptual model for interac-
tion with multi-touch and pen-tablet input devices. The
four layers have different levels of abstraction: starting
from the physical layer to a more abstract representational
layer; among these three are used to construct the current
model for the REGALS behavior, this because our device
is simpler than a generic SOFIA device and than the target
of the conceptual modeling from which the interaction
ontology was originated.
In Fig. 2, we report a piece of ontology that has been
used when modeling the REGALS as a SOFIA interacting
device. As it will be detailed in Sect. 4, the REGALS will
be an instance of smart object provided with a certain
number of interaction primitives; some of the interaction primitives are mapped to the events that represent the functionalities the user wants to activate by performing the corresponding gesture. With reference to the previously cited levels of abstraction, the interaction primitives lie in between the physical layer and the gestural layer, because they are related to the gestures physically performed and to their mapping to the actions;
the intentional layer, related to the actions the user wants to
perform, is represented by the event class. Events may be
activated through different means, such as a button press,
gesture or voice command, all referring to the same
intention. A semantic transformer is responsible for interpreting multimodal interaction data coming from different smart objects and data sources into possible user goals and for mapping them onto the plurality of available services. This
high level of abstraction enables developers to write
applications, which will work across different devices and
services, without having to write specific code for each
possible input device or used service.
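As a concrete illustration of this decoupling (URIs invented for the example), the triples below declare that a button press and a gesture are both transformable into the same intentional event, so an application only needs to handle PlayEvent regardless of the input modality:

```python
# Two interaction primitives, one intention: applications subscribe to
# PlayEvent and never see which modality produced it. URIs are invented.
NS = "http://example.org/interaction#"
mapping_triples = [
    (NS + "ButtonPress_Play", NS + "canBeTransformedTo", NS + "PlayEvent"),
    (NS + "Gesture_MoveDown", NS + "canBeTransformedTo", NS + "PlayEvent"),
]
```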
In this work, we will show how the Interaction Ontology
can be used to describe a category of gesture based and
reconfigurable smart objects. The specificity of these
devices is that they are not committed to a unique inter-
action domain, so that they are inherently very general and
their semantic transformers are mapped by the Smart
Space. In other words, a KP can dynamically assign them a
function, which therefore is inherently context dependent.
Through the implementation of a specific device, the
already mentioned REGALS, we will show how the
Interaction Ontology may be used to tailor its functionality:
new functions may be assigned by the Smart Space to the
REGALS just by adding new mappings between gestural
and intentional events.
4 REGALS
REGALS (Reconfigurable Gesture based Actuator and
Low Range Smartifier) is a smart object, seen as an interaction
device, which enables the user to interact with a smart
environment, through objects that have RFID tags or a
digital interface toward the SIB.
After reading the RFID tag of an object, the REGALS
can change its digital properties, which are stored in the
Fig. 2 Sofia interaction
ontology: view of the most
relevant classes and properties
used for REGALS development
SIB (Fig. 3a). If the object has a network interface to the
SIB, it can also receive commands or information from it,
as shown in Fig. 3b below. This device has gesture rec-
ognition capabilities, enabled by an accelerometer which
senses its movements and provides a simple and natural
way of interaction with the user.
4.1 Physical layer: hardware and software description
The first prototype of the device is based on a GUMSTIX
board and is equipped with an RFID reader, an inertial
sensor module (microcontroller and accelerometer), three
LEDs and a PWM vibro motor to provide feedback to the
user. Figure 4 summarizes the block diagram of the hard-
ware and firmware blocks.
The control unit is the Gumstix Verdex Pro XL6P COM
board [27], which has a Marvell PXA270 600 MHz processor, 128 MB of SDRAM and several connectors
for expansion boards. Ethernet and Wi-Fi capabilities are
provided via an additional module and allow the device to
communicate with the SIB. It runs a Linux kernel optimized for the embedded processor and implements the main features of the KPs, maintaining the wireless connection with the SIB and coordinating the activity of the other components.
The RFID reader is used to identify the SS entity to
interact with. The current REGALS implementation uses
the TagSense Nano-UHF [28]. It operates at UHF fre-
quencies (865–868 MHz, 902–928 MHz) and reads and
writes RFID tags following the EPC Gen 2 standard. The 6 dBi dipole antenna is connected externally by means of an SMA jack.
The gesture recognizer embeds a low-cost, low-power 8-
bit microcontroller (Atmel ATmega168) and a MEMS tri-
axial accelerometer (STM LIS3LV02DQ) with a programmable full scale of ±2 g or ±6 g and digital output. This
module is connected to the Gumstix with a serial connec-
tion and implements a message-based communication
protocol to exchange information. The accelerometer sen-
ses the orientation and the movements of the device, which
are Physical Level events and are the actions taken from
the user to interact with the device. The microcontroller
implements a gesture recognition algorithm and transforms
raw movements into Interaction Primitives passed to higher
levels.
The firmware part, which runs on the Gumstix, is made
up of a Python framework. The framework is formed by a
folder (called Engine) which contains the legacy adapters
(at present the RFID Reader and the Inertial Board adapt-
ers), the SSAP libraries and the run program. The latter launches all the communication functionalities and searches the Applications folder for any applications that can be executed. Every appli-
cation is actually an implementation of a KP for the device,
and they can be navigated and launched by means of a
menu. As we will see in Sect. 4.3, currently the Smartification and the Remote Control applications are implemented as examples. This software architecture
enables a modular and dynamic development of new
functionalities: new applications can be easily added or
executed from the Applications folder using low-level
functionalities of the Engine folder.
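The sketch below illustrates this plug-in pattern under stated assumptions: each application module in the Applications folder exposes a run entry point, and the run program discovers and lists them. The module layout and names are hypothetical, not the actual REGALS sources.

```python
# Sketch of the run program's application discovery: scan the Applications
# folder and expose each KP module through a menu. The run() entry point is
# an assumed convention, not the actual REGALS code.
import importlib.util
import pathlib

def discover_applications(folder="Applications"):
    apps = {}
    for path in pathlib.Path(folder).glob("*.py"):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        if hasattr(module, "run"):       # each application is a KP with run()
            apps[path.stem] = module.run
    return apps

if __name__ == "__main__":
    apps = discover_applications()
    for i, name in enumerate(sorted(apps)):
        print(i, name)                   # e.g. "0 remote_control"
    # A gesture- or menu-driven selector would then launch one entry.
```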
Fig. 3 Representation of the
interaction model of the device:
a the user reads the RFID tag of
a device and exchanges its
information with the SIB; b the
device can be controlled via SIB
Fig. 4 Hardware blocks of the smart object
4.2 Gestural layer: algorithm
Gestures executed with natural hand and arm movements
are variable in their spatial and temporal execution,
requiring classifiers suited for temporal pattern recognition.
Typical approaches include Dynamic Time Warping
(DTW) [29], Neural Networks [30] and Hidden Markov
Models (HMMs), [31–33]. HMMs are often used in
activity recognition since they tend to perform well with a
wide range of sensor modalities and with temporal varia-
tions in gesture duration. Several variants of HMMs have
been proposed to recognize inertial gestures: in [31],
5-state ergodic discrete HMMs are evaluated with the Vi-
terbi algorithm to classify gestures performed with a
handheld sensor. The work of Mäntylä et al. [32] uses
7-state left-to-right models and the forward algorithm to
classify actions performed with a mobile phone equipped
with an accelerometer. Both implementations have similar
performance and rely on a PC to execute all computations.
In our work, we are using low-power hardware without a
floating point unit, thus we implemented a fixed-point
variant of the forward algorithm, presented in a previous
work [33].
Using HMMs to classify gestures from a continuous
stream of data brings another issue to solve: the recognition
procedure needs to discriminate actually executed gestures
from all the other arbitrary movements. Hofmann et al. [34]
use a sensorized glove to recognize hand gestures: to
segment the data stream, they compute the velocity profile
of the sampled accelerations and apply a threshold to
identify the motion segments. In [35], a Gaussian model of
the stationary state is used with a sliding window approach
to find pauses in movements, which identify the beginning
and the end of a gesture. While those works have focused on developing segmentation and recognition solutions, none of them deals with computation- or memory-limited devices.
We found a similar solution implemented on a wristwatch
device, using a 32-bit ARM microcontroller [36], but there
are no works targeting low-cost, low-power 8-bit micro-
controllers, such as the Atmel ATmega168 used in this
work (Fig. 5).
In our case, gestures can occur only at specified
moments (i.e. after reading an RFID tag) and the device
signals to the user when to execute a gesture. All gestures
begin and end with the user holding the device still in the same position, and this condition is used to finely segment only the desired gesture. The recognition algorithm trans-
forms raw movements into interaction primitives and can
trigger higher-level events. To improve the usability and
the recognition performance, we chose a limited set of
gestures, such as changes in orientation, continuous rota-
tions of the device around a horizontal axis, or simple
directional movements, as illustrated in Fig. 6.
The main feature used for gesture recognition is the
direction of the movement, represented by the direction of
the acceleration vector, which is sampled at each frame.
This information is obtained converting the 3D acceleration vector {ax, ay, az} into spherical coordinates and using only the two angles {φ, θ}. The two calculated angles, which identify the arbitrary 3D orientation of a unitary vector, are quantized to the nearest vector of a 26-entry uniform codebook by a minimum-distance classifier. To efficiently compute the two angles of the acceleration vector, we used an implementation of the CORDIC algorithm. Using the notation in Fig. 7, this algorithm first estimates the phase φ and the magnitude r′ of the complex number (ax + i·ay), then estimates the angle θ using r′ and az. All computations are done with integer values, giving us a resolution of 1° and a maximum error of 2°, which is acceptable since we are dealing with human motion and we do not need higher accuracy.
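A floating-point sketch of this feature extraction follows; math.atan2 and math.hypot stand in for the two integer CORDIC passes, and building the codebook from the 26 non-zero directions of a 3 × 3 × 3 neighborhood is our assumption about how the 26-entry uniform codebook is laid out.

```python
import math
from itertools import product

# Assumed 26-entry codebook: the non-zero {-1, 0, 1}^3 directions, normalized.
CODEBOOK = [tuple(c / math.sqrt(sum(x * x for x in v)) for c in v)
            for v in product((-1, 0, 1), repeat=3) if v != (0, 0, 0)]

def angles(ax, ay, az):
    """Two passes mirroring the CORDIC steps: first the phase phi and the
    magnitude r' of (ax + i*ay), then theta from r' and az."""
    phi = math.atan2(ay, ax)
    r_prime = math.hypot(ax, ay)
    theta = math.atan2(r_prime, az)
    return phi, theta

def quantize(ax, ay, az):
    """Index of the codeword closest to the acceleration direction
    (minimum-distance classifier on the unit sphere)."""
    r = math.sqrt(ax * ax + ay * ay + az * az) or 1.0
    u = (ax / r, ay / r, az / r)
    return max(range(len(CODEBOOK)),
               key=lambda i: sum(a * b for a, b in zip(u, CODEBOOK[i])))

print(quantize(0.05, -0.02, 0.99))  # near +z: the (0, 0, 1) codeword
```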
The off-line HMM training phase builds a model for
each of the gestures to recognize, using sample instances of
the gestures. We used the Baum–Welch algorithm and
initialized the training models with several random prob-
ability distributions. Among the resulting HMMs, those
with the lowest training error were chosen. The on-line
recognition algorithm evaluates the executed gesture with
all of the trained models and selects the model with the
highest probability. For this purpose, we used a fixed-point
version of the forward algorithm, as implemented in [37].
This implementation deals with the lack of a division unit
in the low-power microcontroller embedded in the device
Fig. 5 The REGALS prototype
Fig. 6 Gestures recognized by the algorithm
and proposes a different scaling procedure that uses shifts
and a logarithmic representation of the probabilities. Our
previous work compared the performance of this imple-
mentation against a standard floating point algorithm. The
results showed that a 16-bit fixed-point algorithm has the
best trade-off between classification rate and computational
complexity.
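The control flow of the resulting classifier can be sketched as follows. This log-domain, floating-point version only illustrates the structure; the actual firmware replaces it with 16-bit fixed-point arithmetic, shift-based scaling and a logarithmic probability representation [33], [37].

```python
import math

def logsumexp(xs):
    xs = list(xs)
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_forward(obs, log_pi, log_A, log_B):
    """Log-likelihood of a symbol sequence under one HMM.
    log_pi[i]: initial log-probs; log_A[i][j]: transition log-probs;
    log_B[i][o]: emission log-probs for codebook symbol o."""
    alpha = [lp + log_B[i][obs[0]] for i, lp in enumerate(log_pi)]
    for o in obs[1:]:
        alpha = [logsumexp(a + log_A[i][j] for i, a in enumerate(alpha))
                 + log_B[j][o]
                 for j in range(len(alpha))]
    return logsumexp(alpha)

def classify(obs, models):
    """Evaluate the gesture under every trained model; keep the most likely.
    models: dict mapping gesture name to (log_pi, log_A, log_B)."""
    return max(models, key=lambda name: log_forward(obs, *models[name]))
```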
4.3 Mapping on intentional layer: application cases
In this section, we will see how the Semantic Interaction
Ontology and the higher-level interaction layer, introduced
in Sect. 3, can be implemented on a smart object such as the
REGALS. The Smart Space may be used to enhance and
control its functionality, selecting different applications or
changing the semantic transformations between the four
layers.
The modular software architecture adopted enables a
dynamic selection between all the available applications,
changing the overall functionalities of the device. Having
different applications for different tasks is a standard way
to add flexibility to a device, but thanks to the support of a
Smart Space, this device can be reconfigured even without
changing the application. Using the Semantic Interaction
Ontology and the abstraction layers between user actions
and desired events in the digital world, transformations
between gestures and events can be chosen each time,
according to the needs of the application or the user,
making the device fully reconfigurable. This selection is
performed by the SIB and can be implemented depending
on the RFID tag of the selected device or using context
information from the Smart Space. Two different applica-
tions have been developed for the smart object, so far:
Smartification and Remote Control.
Smartification is the word we use to name the process
of creating the digital representation of some physical
objects: the Smart Space is made aware of the relevant
physical world entities, their identification and the spatial
relationships between them. In this scenario, all the phys-
ical objects involved are tagged with a unique RFID tag.
To create a digital representation of one of those objects,
we need to pair the object with a unique identifier. Figure 8
represents the ontology graph that is created during the
Smartification process.
Figure 9 represents the sequence diagram of the interactions between the actors involved in the Smartification: after the
selection of the application, with the corresponding gesture,
the user selects a desired object, reading its RFID tag. If it is
not already in the SIB, then its digital representation is
created, as an instance of the Object class (Object_UUID) coupled with an instance of the IdentificationData class, which has the value of the RFID tag. To
create positional information, the user can identify a room
by reading the corresponding RFID tag and performing the
addRoom gesture. This is equivalent to creating an instance of the Environment class (Room_UUID) and associating with it an instance of the IdentificationData class (Identification_UUID) which has the value of
the room RFID tag. To locate the object in the room, after
the selection of the Smartification application, the user
reads the room and the object identifiers; this association is
Fig. 7 Representation of the acceleration vector and the discrete
codebook
Fig. 8 Smartification ontology
created by means of the ContainsEntity property,
which connects the room and the object.
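In triple form, the outcome of one Smartification run can be sketched as below. The ContainsEntity property and the class names follow the text and Fig. 8; the identification property names and all values are invented examples.

```python
# Triples produced by smartifying one object into one room. Property names
# hasIdentificationData/dataValue are assumed; values are invented.
NS = "http://example.org/sofia#"
obj, obj_id = NS + "Object_42", NS + "Identification_42"
room, room_id = NS + "Room_7", NS + "Identification_7"

smartification_triples = [
    (obj, "rdf:type", NS + "Object"),
    (obj, NS + "hasIdentificationData", obj_id),
    (obj_id, NS + "dataValue", "E200-3412-0123-4567"),   # object RFID tag
    (room, "rdf:type", NS + "Environment"),
    (room, NS + "hasIdentificationData", room_id),
    (room_id, NS + "dataValue", "E200-3412-0999-9999"),  # room RFID tag
    (room, NS + "ContainsEntity", obj),                  # object located in room
]
```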
In the Remote Control application, REGALS is used to
control a digital device, which has an interface to the SIB
and is identified by an RFID tag. In this scenario, the user
selects the device reading its tag and then performs the
gesture corresponding to the command he wants to execute
(e.g. the user selects a Media Player Device and performs a
gesture to start playing some music). Figure 10 represents the flow diagram of the operations executed during
this application. After the selection of the application, the
user reads the desired device's tag, so that the SIB associates the REGALS with the device to control using the connectedTo property. The user can now control the
device using the appropriate gestures. Since the SIB handles all the communications and the semantic transformations, the REGALS is compatible with every device that has an interface toward the SIB, implemented through an
appropriate KP.
The Remote Control application is also an example of the reconfigurable interaction properties of this device when interfaced with the Smart Space. According to the Semantic Interaction Ontology, the smart object has a set of Interaction Primitives and every controlled object has a set of actions. Depending on the controlled device selected and context information stored in the SIB, the Gesture-to-Interaction-Primitive-to-Action mapping is reconfigured. The dynamic reconfiguration of the REGALS
is composed of the following steps:
1. Smart object initialization. When the smart object joins
a new Smart Space, it uploads its digital representation
to the SIB. In this representation, the most relevant pieces of information are the Device Identifier and the available Interaction Primitives.
2. Connection between the smart object and the con-
trolled device (e.g. a Media Player). The user selects
the device to control by reading its RFID tag. The
digital representation of the controlled device is
supposed to be already in the SIB.
3. Smart object reconfiguration. The Interaction Primitives
are mapped to the actions supported by the controlled
device. This mapping can be influenced by different
factors, such as user preferences, controlled device status
or other context information available from the smart
environment.
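From the Smart Space side, step 3 can be sketched as follows, assuming a hypothetical per-user preference table and an insert_triples helper (both invented for illustration); only canBeTransformedTo comes from the ontology used here.

```python
# Sketch of the reconfiguration step: once the smart object is connectedTo a
# device, a mapping KP writes the gesture-to-event transformations chosen
# from the user's profile. The profile table and helper are invented.
NS = "http://example.org/sofia#"

USER_PROFILE = {
    NS + "MediaPlayer_01": {
        NS + "MoveDown": NS + "PlayEvent",
        NS + "MoveUp": NS + "PauseEvent",
    },
}

def reconfigure(smart_object, device, insert_triples):
    mapping = USER_PROFILE.get(device, {})
    insert_triples([(primitive, NS + "canBeTransformedTo", event)
                    for primitive, event in mapping.items()])
```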
The above-described steps can be recognized in Fig. 11, where a simplified example of an ontology graph is shown, reproducing the remote control
application. In the graph, the device, identified as SmartObject_UUID, has joined the Smart Space and has been connected to the controlled device MediaPlayer_UUID, after reading its tag. Therefore, the SIB has
created and added the interaction ontology sub-graph
regarding the mapping between the available interaction
primitives and services, shaded in the figure. In this
example, the smart object has one Interaction Primitive,
the gesture MoveDown and it has been associated with the
event PlayEvent, using the semantic transformation
canBeTransformedTo. Now when the user performs
the gesture MoveDown, the SIB triggers the PlayEvent
and the player executes the corresponding action, playing
some media.
5 Experimental tests
After the construction of the prototype, we tested its main
functionalities. First, we validated the gesture recognition algorithm, to ensure the correct behavior of this
Fig. 9 Activity diagram of the smartification application
Fig. 10 Action diagram for the remote control application
input method. Second, we set up a SIB and used the
remote-control application with example devices. We
simulated some context and user information to verify the
correct integration between the device and the Smart
Space.
5.1 Gesture recognition tests and results
For the validation of our algorithm, we used a set of 7
gestures, illustrated in Fig. 6. All gestures are formed by
natural movements, they start and end with the user hold-
ing the cube in the same position and are executed on the
vertical plane in front of the user, holding the device every
time with the same orientation.
We collected gestures executed by four people, all male
students with an average age of 26 years. To build and
validate the HMMs, each user executed 80 instances of
every gesture, on different days. Those gestures were
continuously executed, with a few seconds of interval
between two consecutive instances. Gestures performed by
three users were used for both modeling and validation,
while gestures from the fourth user were used only to test
the models. The whole dataset was collected with the
REGALS prototype and stored in a PC. No feedback from
the device or the PC was given to the users during the
execution of the gestures. To easily test the performance,
the algorithm was implemented in Matlab, taking care to
simulate the computational constraints of the 8-bit micro-
controller and using only integer computations with con-
trolled variable size.
In the first place, we used the collected dataset to train a
set of HMMs for each user, applying the floating point
notation with double precision. Each model has been
trained using 15 reference instances, 15 loops for the
Baum–Welch training algorithm and 10 initial random
models. The floating point models were then converted
to fixed point, represented only by 16-bit integers. Each
user’s models were validated with his/her own gestures, not
used in the training phase, and with gestures from other
users.
The results show that the classifier performs well in a
single-user scenario, with recognition rates up to 99.7%.
The algorithm has some limitations in a multi-user sce-
nario, particularly when recognizing gestures from a user
with models trained by another user. We found that interpolating the trained models with a uniform one gave some advantages; Table 1 shows the classification rates for
the various users in this case; Table 2 shows the classifi-
cation matrix in the best case.
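The interpolation itself is simple to sketch: every probability row of a trained model is blended with a uniform distribution, flattening over-fitted estimates. The blending weight below is illustrative, not the value used in the reported experiments.

```python
# Blend each row of a stochastic matrix with the uniform distribution.
# lam = 1.0 keeps the trained model; lam = 0.0 yields the uniform one.
def interpolate_with_uniform(rows, lam=0.8):   # lam value is illustrative
    return [[lam * p + (1.0 - lam) / len(row) for p in row] for row in rows]

# Example: smooth an HMM's emission matrix B (A can be treated the same way).
B = [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1]]
B_smoothed = interpolate_with_uniform(B)
```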
To overcome the limitations in the multi-user scenario,
we put together all the gesture instances, regardless of the
user who executed them, and built a global model for each
gesture. These models were trained using 15 randomly
chosen gestures and validated on 200 gestures from four
users. The results of this case are presented in Table 3,
where we can observe that in this case the performance is comparable to the single-user scenario, achieving a classification rate of 94.6% even though we are classifying gestures from all the users. We can also notice that the
fixed-point implementation has performance comparable to
Fig. 11 Remote control ontology
the floating point one, and our algorithm is suitable for
low-performance smart objects.
5.2 Practical application: remote control
As seen, a Smart Space plus the use of the Semantic
Interaction Ontology turns the REGALS from a normal
interaction device into a reconfigurable smart object,
potentially enhanced with context information and the user
profile and preferences. These features have been demon-
strated by means of two applications, Smartification and
Remote Control, and we here present the second one to
illustrate the functionalities of the device.
The simple application scenario used to evaluate the
system is structured as follows: the SIB runs on a PC and
manages all the connections between the devices; the same
PC runs a KP to enable the control of some LEDs, con-
nected through a serial adapter, and a digital frame with
Ethernet connection runs a KP, becoming another con-
trolled device. The LEDs, simulating a generic lighting
device, are an example of a legacy device which is inter-
faced to the SIB through an adapter KP; while the digital
frame, which has been reprogrammed to run a simple
media player and the corresponding KP, is an example of a
Smart Space-enabled device. Both devices have an asso-
ciated RFID tag to identify them, and some actions that can
be performed by the user (e.g. LED on/off or play, stop,
next, previous for the digital frame). To test the reconfigurable properties of the REGALS, we created two user
profiles, each having different gesture mappings to the
various actions available.
Each user performed a simple set of actions, consisting
of turning on and off the LEDs, starting and stopping some
media on the digital frame (e.g. videos, pictures slide-
show). When a user selects one of the two devices, the
corresponding gestures mapping is activated, so that he/she
can control it in the desired way. When the same user
selects the other device, the gestures mapping is automat-
ically updated by the SIB to control the new device, thus
the interaction properties change not only according to
user preferences, but also depending on the device
involved. The rest of this section illustrates in more detail
the actions and events performed during the interaction
with one device, presenting the example of the media
player control (Fig. 12).
The digital frame runs a media player application, has an Ethernet interface through which it communicates with the SIB, and runs a consumer KP which is subscribed to the InteractionPrimitive events and performs the actual MediaPlayerEvents.
In Fig. 13, a screenshot of the SIB explorer is shown at
the end of a PlayEvent action, according to the Remote
control ontology graph presented above. We highlight the
most important steps:
1. Smart object initialization:
(a) The REGALS inserts its instance (SmartObject_01)
(b) The REGALS inserts the InteractionPrimitive sub-graph. In this case: MoveDown, MoveUp, MoveLeft, MoveRight
2. Media player initialization:
(a) A MediaPlayer instance MediaPlayer_01 is inserted, with its identification (Identification_01)
(b) The media player inserts the available events
3. The smart object connects to the media player using the connectedTo property: ⟨SmartObject_01, connectedTo, MediaPlayer_01⟩
Table 1 Classification rates in multiuser scenario
Training set Validation set
User 1 User 2 User 3
User 1 0.99 0.66 0.85
User 2 0.94 0.92 0.85
User 3 0.93 0.71 0.99
Table 2 Classification rate in best case
Gest. Classified as
Up Right Down Left Cir. Sq. X
Up 62 0 0 0 0 3 0
Right 0 65 0 0 0 0 0
Down 0 0 65 0 0 0 0
Left 0 0 0 65 0 0 0
Circle 0 0 0 0 65 0 0
Square 0 0 0 0 0 65 0
X 0 0 0 0 0 0 65
Table 3 Classification rate for global model
Gest. Classified as
Up Right Down Left Cir. Sq. X
Up 194 0 3 0 2 1 0
Right 0 187 1 11 0 1 0
Down 1 0 199 0 0 0 0
Left 0 1 0 199 0 0 0
Circle 0 0 0 4 177 19 0
Square 0 0 0 0 13 187 0
X 3 0 12 0 1 2 182
4. The InteractionPrimitives are mapped to the
events using the canBeTransformedTo property.
In this case, they are PauseEvent, PlayEvent,
NextTrackEvent, PreviousTrackEvent
5. Smart object in action:
(a) The smart object is subscribed to dataValue of
its InteractionPrimitive.
(b) When the user executes one of the gestures, the
smart object creates and launches an Event, in
this case Event_01.
(c) The smart object assigns to the Event an EventType according to the selected gesture and the semantic mapping, which has been
configured by the SIB at the time of the
connection.
In this example, we have recognized the MoveDown
gesture, which is transformed to a PlayEvent and
corresponds to the Play action on the media player, which
starts displaying the desired media content.
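The dispatch logic of the consumer KP on the digital frame can be sketched as follows; the player API and the hasEventType property name are our assumptions, while the event names come from the steps above.

```python
# Consumer KP sketch: on each subscription notification, map the EventType
# to a player action. Player methods and hasEventType are assumed names.
NS = "http://example.org/sofia#"

class Player:                                   # stand-in for the media player
    def play(self): print("play")
    def pause(self): print("pause")
    def next_track(self): print("next track")
    def previous_track(self): print("previous track")

ACTIONS = {
    NS + "PlayEvent": Player.play,
    NS + "PauseEvent": Player.pause,
    NS + "NextTrackEvent": Player.next_track,
    NS + "PreviousTrackEvent": Player.previous_track,
}

def on_notification(added_triples, player):
    for s, p, o in added_triples:
        if p == NS + "hasEventType" and o in ACTIONS:
            ACTIONS[o](player)                   # execute the mapped action

# Example: the MoveDown gesture arrives as a PlayEvent and starts playback.
on_notification([(NS + "Event_01", NS + "hasEventType", NS + "PlayEvent")],
                Player())
```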
6 Conclusion
In this paper, we presented an example of how next gen-
eration smart environments, based on semantic technolo-
gies and context management systems, are suitable for
multimodal and natural TUIs. From a well-known model
proposed for ontology-based interaction, it was possible to devise a set of interaction mechanisms and information flows within the smart environment developed on Smart M3, which made it possible to operate all different kinds of actuators in a natural and intuitive way. Furthermore,
Fig. 12 The REGALS and the digital frame
Fig. 13 Screenshot of the SIB
explorer, showing a running
instance of the SIB, with
highlighted operating steps as
described in the text below
we presented REGALS, a prototype physical device, and put these principles into practice by using low-cost programmable components and smart on-board algorithms to support natural and tangible interaction paradigms, specifically enabling the use of gestures.
The demonstrative features implemented were chosen to
highlight the generality of the interaction model applied to
REGALS and of the support provided by the Smart Space:
the Smartification works only at information level to provide the other Smart Space agents with a more complete knowledge, while the controller functionality uses the
available context (e.g. controlled entity, user profile) to
properly adapt its behavior according to it. The analysis of
the gesture recognition algorithm showed that the perfor-
mance is adequate for a smooth interaction, while, at the
same time, the implementation of advanced algorithms on the object itself opens the possibility of appropriately managing an increased number of devices and their power resources, prolonging device lifetime by minimizing wire-
less transmission. This specific aspect, which has a direct
impact on device usability and comfort, will be addressed in future work, along with a more exhaustive study of the usability of the device when integrated in a Smart Space.
Acknowledgments The Authors would like to thank Luca Faggia-
nelli for his work and assistance on the realization of the REGALS
prototype. This work was carried out within the framework of a
project of the European Joint Undertaking on Embedded Systems
ARTEMIS. The project is called SOFIA (2009–2011), it is coordi-
nated by NOKIA and it is co-funded by the EU and by National
Authorities including MIUR, the Italian Central Authority for Edu-
cation and Research.
References
1. Weiser M (1999) The computer for the 21st century. SIGMOBILE Mob Comput Commun Rev 3(3):3–11
2. Monekosso DN, Remagnino P, Kuno Y (2009) Intelligent envi-
ronments: methods, algorithms and applications. In: Monekosso
D, Kuno Y (eds) Advanced information and knowledge pro-
cessing, 1st edn. Springer, Berlin, p 211. http://www.springer.com/computer/ai/book/978-1-84800-345-3
3. Lopez T, Ranasinghe D, Patkai B, McFarlane D (2009) Taxon-
omy, technology and applications of smart objects. Inform Syst
Front 1387(3326):1–20
4. Bartolini S, Roffia L, Salmon Cinotti T, Manzaroli D, Spadini F, D'Elia A, Vergari F, Zamagni G, Di Stefano L, Franchi A, Farella E, Zappi P, Costanzo A, Montanari E (2010) Creazione automatica di ambienti intelligenti [Automatic creation of smart environments]. Patent pending, March 2010, BO201A000117
5. Schmidt A (2000) Implicit human computer interaction through
context. Pers Ubiquit Comput 4(2):191–199
6. Ryan N (2005) Smart environments for cultural heritage. In:
Takao UNO (ed) Reading historical spatial information from
around the world: studies of culture and civilization based on
geographic information systems data
7. Salmeri A, Licciardi CA, Lamorte L, Valla M, Giannantonio R,
Sgroi M (2009) An architecture to combine context awareness
and body sensor networks for health care applications.
International Conference on Smart Homes and Health Telemat-
ics. Springer, Berlin, pp 90–97
8. Dey A, Salber D, Abowd G (2001) A conceptual framework and
a toolkit for supporting the rapid prototyping of context-aware
applications. Hum Comput Interact 16(2, 3 and 4):97–166
9. Honkola J, Laine H, Brown R, Oliver I (2009) Cross-domain
interoperability: a case study. In: International conference on
smart spaces and next generation wired/wireless networking and
2nd conference on smart spaces (NEW2AN09 and ruSMART09),
pp 22–31
10. Schilit WN (1995) A system architecture for context-aware
mobile computing. PhD thesis, Columbia University
11. Overview of SGML resources: http://www.w3.org/MarkUp/SGML
12. Ranganathan A, Campbell RH (2003) A middleware for context-
aware agents in ubiquitous computing environments. In: Endler
M, Schmidt DC (eds) Middleware, vol 2672 of lecture notes in
computer science, pp 143–161
13. Gruber TR (1993) A translation approach to portable ontology
specifications. Knowl Acquisition 5(2):199–220
14. Smart-M3 public source code: http://sourceforge.net/projects/smart-m3/
15. Mäntyjärvi J, Paternò F, Salvador Z, Santoro C (2006) Scan and
tilt: towards natural interaction for mobile museum guides. In:
Conference on human-computer interaction with mobile devices
and services, pp 191–194
16. Pering T, Ballagas R, Want R (2005) Spontaneous marriages of
mobile devices and interactive spaces. Commun ACM 48(9):
53–59
17. Kranz M, Holleis P, Schmidt A (2010) Embedded interaction:
interacting with the internet of things. IEEE Internet Comput
14(2):46–53
18. Pan G, Wu J, Zhang D, Wu Z, Yang Y, Li S (2010) GeeAir: a
universal multimodal remote control device for home appliances.
Pers Ubiquit Comput 14(8):723–735
19. Niezen G, Van der Vlist B, Hu J, Feijs L (2010) From events to
goals: supporting semantic interaction in smart environments. In:
The IEEE symposium on computers and communications,
pp 1029–1034
20. Franchi A, Di Stefano L, Cinotti TS (2010) Mobile visual search
using Smart-M3. In: The IEEE symposium on computers and
communications, pp 1065–1070
21. Van der Vlist B, Niezen G, Hu J, Feijs L (2010) Semantic con-
nections: exploring and manipulating connections in smart
spaces. IEEE Symp Comput Commun, pp 1–4
22. Vergari F, Bartolini S, Spadini F, D’Elia A, Zamagni G, Roffia L,
Cinotti TS (2010) A smart space application to dynamically relate
medical and environmental Information. In: Design Automation
& Test in Europe (DATE10), pp 1542–1547
23. http://www.w3.org/RDF
24. http://www.w3.org/TR/owl-ref
25. Horrocks I, Kutz O, Sattler U (2006) The even more irresistible
SROIQ. In: International conference of knowledge representation
and reasoning, pp 57–67
26. http://www.w3.org/2010/webevents/charter
27. http://www.gumstix.com/store/catalog/product_info.php?products_id=210
28. http://www.tagsense.com/index.php?option=com_content&view=article&id=142:nano-uhf&catid=49:uhf-readers&Itemid=117
29. Kim L, Cho H, Park SH, Han M (2007) A tangible user interface
with multimodal feedback. In: International conference on
human-computer interaction, pp 94–103
30. Bailador G, Roggen D, Tröster G, Trivino G (2007) Real time
gesture recognition using continuous time recurrent neural net-
works. In: International conference on body area networks
(BodyNets), Article no. 15
31. Kela J, Korpipää P, Mäntyjärvi J, Kallio S, Savino G, Jozzo L, Marca D (2006) Accelerometer-based gesture control for a design environment. Pers Ubiquit Comput 10(5):285–299
32. Mäntylä V-M, Mäntyjärvi J, Seppänen T, Tuulari E (2000) Hand
gesture recognition of a mobile device user. IEEE Int Conf
MultiMed Expo 1(c):281–284
33. Zappi P, Milosevic B, Farella E, Benini L (2009) Hidden Markov
model based gesture recognition on low-cost, low-power tangible
user interfaces. Entertain Comput 1(2):75–84
34. Hofmann FG, Heyer P, Hommel G (1997) Velocity profile based
recognition of dynamic gestures with discrete Hidden Markov
models. In: Gesture and sign language in human-computer
interaction, international gesture workshop. Springer, Berlin,
pp 81–95. http://www.springerlink.com/content/wju4v16208336502/about/
35. Chambers GS, Venkatesh S, West GA, Bui HH (2004) Seg-
mentation of intentional human gestures for sports video anno-
tation. In: International Multimedia Modelling Conference,
pp 124–130
36. Amstutz R, Amft O, French B, Smailagic A, Siewiorek D, Tröster G (2009) Performance analysis of an HMM-based gesture rec-
ognition using a wristwatch device. Int Conf Comput Sci Eng
02:303–309
37. Milosevic B, Farella E, Benini L (2010) Continuous gesture
recognition for resource constrained smart objects. In: Proceed-
ings of the fourth international conference on mobile ubiquitous
computing, systems, services and technologies, UBICOMM
2010. pp 391–396