[ieee 2014 ieee international conference on pervasive computing and communication workshops (percom...
Bayesian Nonparametric Extraction of Hidden Contexts from Pervasive Honest Signals
Thuong Nguyen
Centre for Pattern Recognition and Data Analytics, Deakin University, Australia
Email: [email protected]
Abstract: Hidden patterns and contexts play an important part in intelligent pervasive systems. Most existing work has focused on simple forms of context derived directly from raw signals. High-level constructs and patterns have been largely neglected or remain under-explored in pervasive computing, mainly because of their growing complexity over time and the lack of efficient, principled methods to extract them. Traditional parametric modeling approaches from machine learning struggle to discover new, unseen patterns and contexts arising from continuously growing data streams because they follow a train-then-predict paradigm. In this work, we propose to apply Bayesian nonparametric models as a systematic and rigorous paradigm to continuously learn hidden patterns and contexts from raw social signals, providing basic building blocks for context-aware applications. Bayesian nonparametric models allow the model complexity to grow with the data, which fits naturally with several problems encountered in pervasive computing. Under this framework, we use nonparametric prior distributions to model the data generative process, which helps to learn the number of latent patterns automatically, adapt to changes in the data, and discover never-seen-before patterns, contexts and activities. The proposed methods are agnostic to data type; in this work we demonstrate them on two types of signals: accelerometer activity data and Bluetooth proximal data.
I. INTRODUCTION
The development of smartphone and wearable device technology
has deeply changed our lives. To an extent, it realizes
Mark Weiser’s vision of pervasive computing [10].
One important part of Weiser’s vision is that the system can act
and respond adaptively to changes in the data, which are often
caused by changes in context. Therefore, the extraction of contexts
plays an important role in pervasive systems. Simple, low-level
contexts such as location and physical activity can be
extracted directly from sensor data collected by GPS, WiFi,
Bluetooth or accelerometer. However, these contexts can be
unreliable without additional interpretation and
careful data cleaning. Furthermore, there are richer, but hidden,
patterns in the data that are important for the operation of
such systems. These hidden patterns, once extracted, provide
high-level contexts that help the system behave intelligently.
Examples include human interaction patterns, daily routine
activities, and proximity of people.
However, deriving high-level contexts that are hidden in
the data is a challenging task. The dominant methods are
unsupervised: clustering and factor analysis techniques such as
Gaussian mixture models (GMM), K-means, hidden Markov models
(HMM), latent Dirichlet allocation (LDA), principal component
analysis (PCA), and matrix factorization. These models are
parametric: once learned, they are fixed and cannot grow or
expand as the data changes. They may therefore be unsuitable for
pervasive applications, where the data typically grows over
time with “no clear beginning and ending” [1].
To address these problems, this work explores the use of
Bayesian nonparametric models as a systematic and rigorous
paradigm to continuously learn hidden patterns and contexts
from raw social signals. The key feature of Bayesian
nonparametric models is that the model complexity grows
with the data: the number of latent patterns
is learned automatically, adapting to changes in the data and
discovering unseen patterns. We discuss how this modeling
framework can be applied to a wide range of applications and
provide initial results from analyzing the popular Reality
Mining dataset and a dataset collected in our own lab. We
apply the models to two types of data: accelerometer-based
activity data and Bluetooth-based proximal data.
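As a rough illustration of how model complexity can grow with data, the following sketch simulates the Chinese restaurant process, the clustering rule that underlies Dirichlet process models. It is not taken from the paper: the function name `crp_assignments`, the concentration value `alpha=1.0`, and the sample sizes are assumptions chosen for illustration only.

```python
import numpy as np

def crp_assignments(n_points, alpha, rng):
    """Sample cluster labels for n_points from a Chinese restaurant process."""
    labels = []
    counts = []  # counts[k] = number of points already assigned to cluster k
    for _ in range(n_points):
        # An existing cluster k is chosen with weight counts[k],
        # a brand-new cluster with weight alpha.
        weights = np.array(counts + [alpha], dtype=float)
        k = int(rng.choice(len(weights), p=weights / weights.sum()))
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels

rng = np.random.default_rng(0)
labels = crp_assignments(2000, alpha=1.0, rng=rng)
# Number of distinct clusters among the first n points, for growing n.
clusters_seen = [len(set(labels[:n])) for n in (10, 100, 1000, 2000)]
print(clusters_seen)
```

The number of clusters is never fixed in advance; new clusters keep appearing, at a slowing rate, as more points arrive.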
II. DETECTION OF PHYSICAL ACTIVITY LEVEL USING
HIERARCHICAL DIRICHLET PROCESS
Physical activity has a strong impact on human health.
Sedentary habits are one of the causes of the rising incidence of
various diseases such as obesity, diabetes, high blood pressure,
and cardiovascular disease. Detecting physical activity levels
can help to improve health care monitoring and to encourage
lifestyle changes that reduce the risk of these diseases.
One of the tasks of this work is to explore how to detect the
physical activity level in a real-life setting. To address this, we
collected a new dataset using the sociometric badge [7] during
working hours over three weeks. The dataset was collected from 11
participants who are members of our lab. We use the consistency
feature as the physical activity indicator. The consistency
is defined as 1 minus the standard deviation of
$\sqrt{x^2 + y^2 + z^2}$
during each one-second interval, where $x$, $y$, $z$ are the three
acceleration values sensed by the triaxial accelerometer. The
consistency ranges from 0 to 1, where 1 indicates no change
in activity, and 0 indicates the maximum amount of variation
in activity. Fig. 1 shows some examples of consistency values
collected during various activities. The consistency values of
sedentary activities, e.g. sitting, are generally close to 1, whilst
those of more intense activities, e.g. walking or playing table
tennis, are closer to 0. The ground truth of the activities is
Seventh Annual PhD Forum on Pervasive Computing and Communications, 2014
978-1-4799-2736-4/14/$31.00 ©2014 IEEE 168
[Figure 1 omitted. Panels: "Physical activity levels" (Level 1 to Level 4); "Consistency values" and "Inferred labels" plotted against time (seconds) for Sitting, Walking, and Playing table tennis.]
Fig. 1: Physical activity levels detected by HDP (left) and the inferred labels for three activities: sitting, walking, and playing table tennis. The colors of the inferred labels are the same as those of the physical activity components.
provided by the participants through the Magpi application¹
on mobile phones.
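The consistency feature defined earlier in this section can be sketched as follows. This is a minimal illustration, not the badge's own implementation: the sampling rate, the function name `consistency`, and the clipping to [0, 1] are assumptions, since the text states the range of the feature but not how it is enforced.

```python
import numpy as np

def consistency(acc, samples_per_second):
    """acc: array of shape (n_samples, 3) holding x, y, z acceleration.
    Returns one consistency value per complete one-second window."""
    acc = np.asarray(acc, dtype=float)
    magnitude = np.sqrt((acc ** 2).sum(axis=1))        # sqrt(x^2 + y^2 + z^2)
    n_windows = len(magnitude) // samples_per_second   # drop the incomplete tail
    windows = magnitude[: n_windows * samples_per_second]
    windows = windows.reshape(n_windows, samples_per_second)
    # Assumption: clip so that the result stays within [0, 1].
    return np.clip(1.0 - windows.std(axis=1), 0.0, 1.0)

# Synthetic example with an assumed rate of 20 samples per second.
rng = np.random.default_rng(0)
still = np.tile([0.0, 0.0, 1.0], (100, 1))             # no movement at all
moving = still + rng.normal(0.0, 0.3, size=(100, 3))   # noisy movement
c_still = consistency(still, samples_per_second=20)
c_moving = consistency(moving, samples_per_second=20)
```

A perfectly still signal yields a consistency of 1 in every window, while the noisy signal yields lower values.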
We organize each activity sample as a group of consistency
values and model the distribution of consistency as a mixture
of normal distributions. We employ the hierarchical Dirichlet
process (HDP) [5], [9] to detect the physical activity levels.
This model infers the number of levels automatically from
the observed data instead of requiring it to be specified in
advance. Fig. 1 illustrates the normal distribution components
inferred by HDP; each component can be interpreted as a physical
activity level. These levels are highly correlated with movement
intensity. The mixture proportions of each activity can be used
as features for other tasks such as activity recognition.
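To make the structure of this model more concrete, the sketch below samples synthetic data from a truncated approximation of an HDP mixture of normals: global level weights are shared across groups, and each group draws its own mixture proportions around them. It shows only the generative side, not the inference used in [5], [9]. The truncation level, the hyperparameters `gamma` and `alpha`, the uniform prior on level means, and the shared standard deviation are all assumptions made for illustration.

```python
import numpy as np

def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights; the last stick takes the remainder."""
    v = rng.beta(1.0, concentration, size=truncation)
    v[-1] = 1.0
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def sample_hdp_mixture(n_groups, n_per_group, gamma, alpha, truncation, rng):
    # Global weights over activity levels, shared by every group.
    beta = stick_breaking(gamma, truncation, rng)
    # Assumed priors: level means uniform on [0.4, 1.0], one shared std.
    level_means = rng.uniform(0.4, 1.0, size=truncation)
    level_std = 0.02
    # With a finite set of shared levels, each group's proportions
    # follow a Dirichlet distribution centred on the global weights.
    proportions = rng.dirichlet(alpha * beta, size=n_groups)
    data = np.empty((n_groups, n_per_group))
    for j in range(n_groups):
        z = rng.choice(truncation, size=n_per_group, p=proportions[j])
        data[j] = rng.normal(level_means[z], level_std)
    return beta, proportions, data

rng = np.random.default_rng(0)
beta, proportions, data = sample_hdp_mixture(
    n_groups=3, n_per_group=600, gamma=2.0, alpha=5.0, truncation=8, rng=rng
)
```

Each row of `proportions` plays the role of the per-activity mixture proportions mentioned above, which could serve as a feature vector for that activity sample.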
III. CONTINUOUS EXTRACTION OF PROXIMAL CONTEXTS
Understanding human interaction is important in many
applications. It can help to change organizational structure,
track the spread of infectious diseases, or monitor
interaction-related mental health problems, to name a few.
Bluetooth signals, which yield information about other nearby
Bluetooth devices, give an opportunity to study interaction
over the long term. In a community of people instrumented
with Bluetooth devices, proximal groups can be extracted
using graph-based approaches such as K-clique or the Louvain
method. These approaches provide only the final groupings
over the whole data, without information about user membership
over time. Thus, they are of limited use for stream data analysis.
Alternatives from machine learning, e.g. principal component
analysis (PCA) or latent Dirichlet allocation (LDA), can
provide both groupings and user membership, but they require
the number of groups to be specified in advance. This parameter is
difficult to determine in many applications. Moreover, it may
change dynamically in stream data and needs to be estimated
automatically and incrementally.
To address this problem, we propose to use two models following
two different approaches: the hierarchical Dirichlet process
(HDP) for the clustering approach and the Indian buffet process
(IBP) for the factor analysis approach. We also propose incremental
inference for these models to learn the contexts online. We
demonstrate the models using Bluetooth data from two datasets:
the sociometric dataset collected in our lab and the Reality
Mining dataset [2].
¹ https://www.magpi.com/
A. Hierarchical Dirichlet Process
Bluetooth data from each user is organized as a group of
data points, each of which is a count vector over all users.
Specifically, each data point generated for a user records the
number of times he/she is in proximity with each other user during
a time interval (such as 10 minutes). The data points are modeled
using multinomial distributions. We employ the hierarchical
Dirichlet process (HDP) to learn the latent patterns from these
data groups. The purpose is to exploit the statistical
strength shared between the groups and users. HDP uses the same set
of patterns, commonly known as topics, to explain all data groups,
and assigns a mixture proportion to each group.
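The data organization described above can be sketched as follows. The input format is an assumption: the sketch supposes that each Bluetooth scan has already been reduced to a hypothetical `(timestamp_seconds, observer_id, observed_id)` record with integer user IDs, which is not necessarily how either dataset stores its scans.

```python
from collections import defaultdict
import numpy as np

def proximity_counts(scans, n_users, interval_seconds=600):
    """Group scan records into per-user, per-interval count vectors.

    scans: iterable of (timestamp_seconds, observer_id, observed_id).
    Returns {observer_id: {interval_index: count vector of length n_users}}.
    """
    groups = defaultdict(lambda: defaultdict(lambda: np.zeros(n_users, dtype=int)))
    for timestamp, observer, observed in scans:
        interval = int(timestamp // interval_seconds)  # 10-minute bins by default
        groups[observer][interval][observed] += 1
    return groups

# Toy records for three users (IDs 0, 1, 2).
scans = [(0, 0, 1), (120, 0, 1), (300, 0, 2), (700, 0, 2), (30, 1, 0)]
groups = proximity_counts(scans, n_users=3)
print(groups[0][0])  # counts seen by user 0 in the first interval
```

Each inner count vector is one data point, and all vectors belonging to one observer form that user's group.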
Under this setting, each inferred multinomial pattern represents
a typical proximal group. The users that have high
values in the same pattern are usually co-located with each
other. We interpret each pattern as a proximal group by applying
a threshold to the multinomial vector: the users whose
values are greater than the threshold form the corresponding
group. Fig. 2 illustrates the proximal groups interpreted from
9 patterns learned by HDP on Reality Mining data. These
proximal groups are strongly correlated with the affiliation labels,
especially for the participants of the Sloan and Masfrosh groups.
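The thresholding step can be sketched in a few lines. The pattern values and the threshold of 0.1 below are made up for illustration; the paper does not state the threshold it uses.

```python
import numpy as np

def pattern_to_group(pattern, threshold):
    """Return the indices of users whose weight in the pattern exceeds the threshold."""
    pattern = np.asarray(pattern, dtype=float)
    return np.flatnonzero(pattern > threshold).tolist()

# Two hypothetical multinomial patterns over five users.
patterns = np.array([
    [0.30, 0.28, 0.32, 0.05, 0.05],
    [0.02, 0.03, 0.05, 0.45, 0.45],
])
groups = [pattern_to_group(p, threshold=0.1) for p in patterns]
print(groups)
```

Here the first pattern is read as a group of users 0, 1 and 2, and the second as a group of users 3 and 4.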
Inference for this model has been developed using
Gibbs sampling and is presented in [5], [8]. As a next step, we
propose to use the decayed MCMC method for incremental inference.
Fig. 2: Interpreted proximal groups learned by HDP on Reality Mining data. Each numeric node represents a user, labeled with the user ID, and the other nodes represent the groups. The color of the user nodes indicates the affiliation labels given in the dataset.
B. Indian Buffet Process
The main characteristic of the clustering approach (such as
HDP) is the mutually exclusive usage of topics in each data
point, i.e., for each data point, an increase in the use of one
topic makes the others decrease. Thus, we propose to use the
factor analysis approach as a complement for proximal context
detection. We also explore the similarities and differences
between the two approaches, clustering and factor analysis,
on this task.
For this, we collect all data points from all users to construct
an $N \times M$ matrix $X$, where $N$ is the total number of data
points, and $M$ is the total number of users. We model the
matrix $X = [X_{im}]$ as:
$$X_{im} \mid Z_i, Y \sim \mathrm{Poisson}\left( \sum_{k=1}^{\infty} Z_{ik} Y_{km} \right)$$
where $Y = [Y_{km}]$ is the factor matrix and $Z = [Z_{ik}]$ is the
coefficient matrix. The key point of this model is that it has
an infinite number of factors, specified by the Indian buffet
process [3] prior on $Z$. Given the observed data, the model
automatically infers the number of factors. The incremental
inference is performed using the fixed-lag particle filter [4],
which processes data in small batches. This inference method can
extract the proximal contexts incrementally as they emerge.
The factorial error obtained by the fixed-lag particle filter is
close to that of Gibbs sampling, whilst it runs
more than 100 times faster.
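The generative side of this factor model can be sketched as follows: a binary coefficient matrix is drawn from the Indian buffet process prior, and counts are then drawn from a Poisson distribution whose rate is the product of the coefficients and the factors. This is only an illustration of the model structure, not the fixed-lag particle filter of [4]. The Gamma(1, 1) prior on the factor matrix, the IBP parameter `alpha=2.0`, and the matrix sizes are assumptions, as the text does not specify them.

```python
import numpy as np

def sample_ibp(n_rows, alpha, rng):
    """Draw a binary matrix Z from the Indian buffet process prior."""
    factor_counts = []  # factor_counts[k] = rows that already use factor k
    rows = []
    for i in range(1, n_rows + 1):
        # Row i reuses existing factor k with probability factor_counts[k] / i.
        row = [int(rng.random() < count / i) for count in factor_counts]
        for k, used in enumerate(row):
            factor_counts[k] += used
        # Row i then introduces Poisson(alpha / i) brand-new factors.
        n_new = int(rng.poisson(alpha / i))
        row.extend([1] * n_new)
        factor_counts.extend([1] * n_new)
        rows.append(row)
    Z = np.zeros((n_rows, len(factor_counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, : len(row)] = row
    return Z

rng = np.random.default_rng(0)
N, M = 50, 6                      # data points and users (illustrative sizes)
Z = sample_ibp(N, alpha=2.0, rng=rng)
# Assumed prior on the factors: independent Gamma(1, 1) entries.
Y = rng.gamma(shape=1.0, scale=1.0, size=(Z.shape[1], M))
X = rng.poisson(Z @ Y)            # X[i, m] ~ Poisson(sum_k Z[i, k] * Y[k, m])
```

The number of columns of `Z` is not fixed beforehand; it is determined by how many factors the rows end up introducing.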
Each extracted factor is an $M$-dimensional vector and can
be interpreted as a proximal group. If we normalize each factor
to sum to 1, it is equivalent to the multinomial pattern extracted
by the HDP model. Therefore, we interpret the factors as
proximal groups in the same manner.
IV. APPLICATIONS
The latent patterns and contexts extracted from social honest
signals play an important role in a wide range of applications;
we briefly discuss a few of them here.
Pervasive healthcare. Wearable devices can help monitor
personal health by detecting physical status and contexts.
Olguín and Pentland [6] suggest using the sociometric badge
as a self-monitoring device that warns of early symptoms of
depression, or monitors the daily activities of the elderly. Our
models can help to continuously detect physical activity contexts,
or other aspects of sociometric badge data such as speech or interaction.
Smart environment. A smart house can be built using wireless
devices and wearable sensors to support independent living
for elderly or disabled people. The proposed framework can
also be used to revisit an existing problem in location-aware
applications. For example, it can be used to continuously learn
three useful contexts: motion state, location and movement
patterns over time and space. It is known that location often
co-occurs with certain activities or roles and aspects of users’
physical states, and also indicates users’ affordances such as
interruptibility. Leveraging these hidden contexts may provide
a better approach to a wide range of applications, from automated
battery management to assistive systems (e.g., for the visually
impaired).
Organization management. Pervasive systems including
wireless environmental and wearable sensors can be deployed
to construct sensible organizations as proposed in [6]. Those
systems can capture the behavior of people and their
interactions. Our models can help to detect the social contexts
of individuals and groups, improving organizational
performance. For example, the proximal groups described in Section
III provide a clear understanding of which groups of members
usually interact with each other. This information can help to
change the structure of the organization, creating teams that
have good relationships and increasing the productivity of the
whole organization.
V. CONCLUSION
In this work, we propose to use Bayesian nonparametric
models as a rigorous framework for the extraction of rich and
hidden patterns and contexts from pervasive honest signals
to move beyond the basic derivations from raw signals. The
key advantage of this approach lies in its ability to discover
sophisticated and meaningful patterns, which can grow with
data, a crucial feature of any pervasive computing system. We
have demonstrated the use of two such models on activity
data and proximal data to extract physical activity contexts
and proximal contexts. The discovery of such contexts is
useful in various applications such as health care monitoring
or organization management.
ACKNOWLEDGEMENT
I am deeply grateful to my supervisors, Associate Professor
Dinh Phung, Professor Svetha Venkatesh and Dr. Sunil Gupta,
for their guidance and assistance.
REFERENCES
[1] G.D. Abowd and E.D. Mynatt. Charting past, present, and future: Research in ubiquitous computing. ACM Transactions on Computer-Human Interaction, 7(1):29–58, 2000.
[2] N. Eagle and A. Pentland. Reality mining: Sensing complex social systems. Personal and Ubiquitous Computing, 10(4):255–268, 2006.
[3] T. Griffiths and Z. Ghahramani. Infinite latent feature models and the Indian buffet process. Advances in Neural Information Processing Systems, 18:475, 2006.
[4] T.C. Nguyen, S. Gupta, S. Venkatesh, and D. Phung. Fixed-lag particle filter for continuous context discovery using Indian buffet process. In IEEE International Conference on Pervasive Computing and Communications (PerCom) (accepted), 2014.
[5] T.C. Nguyen, D. Phung, S. Gupta, and S. Venkatesh. Extraction of latent patterns and contexts from social honest signals using hierarchical Dirichlet processes. In PerCom, pages 47–55. IEEE, 2013.
[6] D.O. Olguín and A.S. Pentland. Sociometric badges: State of the art and future applications. In IEEE 11th International Symposium on Wearable Computers (ISWC), 2007.
[7] D.O. Olguín and A.S. Pentland. Social sensors for automatic data collection. In Americas Conference on Information Systems, page 171, 2008.
[8] D. Phung, T.C. Nguyen, S. Gupta, and S. Venkatesh. Learning latent activities from social signals with hierarchical Dirichlet process. In Handbook on Plan, Activity, and Intent Recognition (to appear). Elsevier, 2014.
[9] Y.W. Teh, M.I. Jordan, M.J. Beal, and D.M. Blei. Hierarchical Dirichlet processes. J. Am. Statist. Assoc., 101(476):1566–1581, 2006.
[10] M. Weiser. The computer for the 21st century. Scientific American, 265(3):94–104, 1991.