

Page 1: [IEEE 2014 IEEE International Conference on Pervasive Computing and Communication Workshops (PERCOM WORKSHOPS) - Budapest, Hungary (2014.03.24-2014.03.28)] 2014 IEEE International

Bayesian Nonparametric Extraction of Hidden Contexts from Pervasive Honest Signals

Thuong Nguyen

Centre for Pattern Recognition and Data Analytics, Deakin University, Australia

Email: [email protected]

Abstract: Hidden patterns and contexts play an important part in intelligent pervasive systems. Most existing work has focused on simple forms of context derived directly from raw signals. High-level constructs and patterns have been largely neglected or remain under-explored in pervasive computing, mainly due to their growing complexity over time and the lack of efficient, principled methods to extract them. Traditional parametric modeling approaches from machine learning find it difficult to discover new, unseen patterns and contexts arising from the continuous growth of data streams because they follow a training-then-prediction paradigm. In this work, we propose to apply Bayesian nonparametric models as a systematic and rigorous paradigm to continuously learn hidden patterns and contexts from raw social signals, providing basic building blocks for context-aware applications. Bayesian nonparametric models allow the model complexity to grow with the data, fitting naturally to several problems encountered in pervasive computing. Under this framework, we use nonparametric prior distributions to model the data generative process, which helps in learning the number of latent patterns automatically, adapting to changes in data, and discovering never-seen-before patterns, contexts and activities. The proposed methods are agnostic to data types; however, our work demonstrates them on two types of signals: accelerometer activity data and Bluetooth proximal data.

I. INTRODUCTION

The development of smartphone and wearable device technology has deeply changed our lives. To an extent, it realizes Mark Weiser’s vision of pervasive computing [10]. One important part of Weiser’s vision is that the system can act and respond adaptively to changes in data, which are often due to changes in context. Therefore, the extraction of contexts plays an important role in pervasive systems. Simple, low-level contexts such as location and physical activity can be extracted directly from sensor data collected by GPS, WiFi, Bluetooth or accelerometer. However, these contexts might be susceptible to changes without additional interpretation and careful data cleaning. Furthermore, there are richer but hidden patterns in the data that are important for the operation of such systems. These hidden patterns, once extracted, provide high-level contexts that help the system behave intelligently. Examples include human interaction patterns, daily routine activities, and proximity of people.

However, deriving high-level contexts that are hidden in the data is a challenging task. The methods most commonly used fall under unsupervised learning: clustering and factor analysis techniques such as Gaussian mixture models (GMM), K-means, hidden Markov models (HMM), latent Dirichlet allocation (LDA), principal component analysis (PCA), and matrix factorization. These models are parametric: once learned, they are fixed and unable to grow or expand with changes in data. These parametric methods might also be unsuitable for pervasive applications, as the data typically grows over time with “no clear beginning and ending” [1].

To address these problems, this work explores the use of Bayesian nonparametric models as a systematic and rigorous paradigm to continuously learn hidden patterns and contexts from raw social signals. The key feature of Bayesian nonparametric models is that the model complexity can grow with the data; the number of latent patterns can be learned automatically to adapt to changes in data and discover unseen patterns. We discuss how this modeling framework can be applied to a wide range of applications and provide initial results from analyzing the popular Reality Mining dataset and a dataset collected in our own lab. We apply the models to two types of data: accelerometer-based activity data and Bluetooth-based proximal data.

II. DETECTION OF PHYSICAL ACTIVITY LEVEL USING HIERARCHICAL DIRICHLET PROCESS

Physical activity has a high impact on human health. A sedentary lifestyle is one of the causes of the increasing prevalence of various diseases such as obesity, diabetes, high blood pressure, and cardiovascular disease. Detecting the physical activity level can help to improve health care monitoring and to encourage lifestyle changes that reduce the risks of these diseases.

One of the tasks of this work is to explore how to detect the physical activity level in a real-life setting. To address this, we collect a new dataset using the sociometric badge [7] during working hours over three weeks. The dataset is collected from 11 participants who are members of our lab. We use the consistency feature as the physical activity indicator. The consistency is defined as 1 minus the standard deviation of $\sqrt{x^2 + y^2 + z^2}$ during each one-second interval, where $x$, $y$, $z$ are the three acceleration values sensed by the triaxial accelerometer. The consistency ranges from 0 to 1, where 1 indicates no change in activity and 0 indicates the maximum amount of variation in activity. Fig. 1 shows some examples of consistency values collected during various activities. The consistency values of sedentary activities, e.g. sitting, are generally close to 1, whilst those of more intense activities, e.g. walking or playing table tennis, are closer to 0. The ground truth of the activities is

Seventh Annual PhD Forum on Pervasive Computing and Communications, 2014

978-1-4799-2736-4/14/$31.00 ©2014 IEEE 168


[Figure 1 panels: physical activity levels (Level 1 to Level 4); consistency values and inferred labels over time (seconds) for sitting, walking, and playing table tennis.]

Fig. 1: Physical activity levels detected by HDP (left) and the inferred labels for three activities: sitting, walking, and playing table tennis. The colors of the inferred labels are the same as those of the physical activity components.

provided by the participants through the Magpi application¹ on mobile phones.
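The consistency feature defined in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the badge's actual implementation: the function name `consistency`, the use of the population standard deviation, and the clamping to [0, 1] are assumptions, and the readings are assumed to be scaled so that the standard deviation stays within that range.

```python
import math
import statistics


def consistency(samples):
    """Consistency of one interval: 1 minus the std of the acceleration magnitude.

    samples: list of (x, y, z) readings that fall in one one-second interval.
    """
    if not samples:
        raise ValueError("need at least one sample")
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    spread = statistics.pstdev(magnitudes)   # population std (an assumption)
    return min(1.0, max(0.0, 1.0 - spread))  # clamp to the stated [0, 1] range


# Hypothetical readings: a motionless badge and one with a varying magnitude.
still = [(0.0, 0.0, 1.0)] * 4
moving = [(0.0, 0.0, 0.6), (0.0, 0.0, 1.4), (0.0, 0.0, 0.6), (0.0, 0.0, 1.4)]
still_score = consistency(still)
moving_score = consistency(moving)
```

A constant magnitude yields a consistency of 1, and larger fluctuations push the value towards 0, matching the interpretation given in the text.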

We organize each activity sample as a group of consistency values and model the distribution of consistency as a mixture of normal distributions. We employ the hierarchical Dirichlet process (HDP) [5], [9] to detect the physical activity levels. This model infers the number of levels automatically from the observed data, so it does not need to be specified in advance. Fig. 1 illustrates the normal distribution components inferred by HDP; each component can be interpreted as a physical activity level. These levels are highly correlated with the movement intensity. The mixture proportions of each activity can be used as features for other tasks such as activity recognition.
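For reference, the standard HDP mixture of Teh et al. [9], specialized to normal components, can be written as follows. This is a sketch in the notation of [9] and not necessarily the exact parameterization used in this work: the concentration parameters $\gamma$ and $\alpha$, the base measure $H$ over component parameters, and the indexing ($j$ for activity samples, $i$ for consistency values within a sample) are assumptions made for illustration.

```latex
\begin{align*}
G_0 \mid \gamma, H &\sim \mathrm{DP}(\gamma, H) \\
G_j \mid \alpha, G_0 &\sim \mathrm{DP}(\alpha, G_0) \\
\theta_{ji} \mid G_j &\sim G_j, \qquad \theta_{ji} = (\mu_{ji}, \sigma^2_{ji}) \\
x_{ji} \mid \theta_{ji} &\sim \mathcal{N}(\mu_{ji}, \sigma^2_{ji})
\end{align*}
```

Because every $G_j$ is drawn from a Dirichlet process whose base measure is the same discrete $G_0$, all groups share one set of normal components and differ only in their mixture proportions.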

III. CONTINUOUS EXTRACTION OF PROXIMAL CONTEXTS

Understanding human interaction is important in many applications. It can help to change organizational structure, track the spread of infectious diseases, or monitor interaction-related mental health problems, to name a few.

Bluetooth signals, which yield information about other proximal Bluetooth devices, give an opportunity to study interaction over the long term. In a community of people instrumented with Bluetooth devices, proximal groups can be extracted using graph-based approaches such as K-clique or the Louvain method. These approaches provide only the final groupings of the whole data, without information about user membership over time. Thus, they are of limited use in stream data analysis. Alternatives from machine learning, e.g. principal component analysis (PCA) or latent Dirichlet allocation (LDA), can provide both groupings and user membership, but they require the number of groups to be pre-specified. This parameter is difficult to identify in many applications. Moreover, it may change dynamically in stream data and needs to be estimated automatically and incrementally.

To address this problem, we propose to use two models following two different approaches: the hierarchical Dirichlet process (HDP) for the clustering approach and the Indian buffet process (IBP) for the factor analysis approach. We also propose incremental inference for these models to learn the contexts online. We demonstrate the models using Bluetooth data from two datasets: the sociometric dataset collected in our lab and the Reality Mining dataset [2].

¹ https://www.magpi.com/

A. Hierarchical Dirichlet Process

Bluetooth data from each user is organized as a group of data points, each of which is a count vector over all users. Specifically, each data point generated for a user represents the number of times he/she is in proximity with each of the others during a time interval (such as 10 minutes). The data points are modeled using multinomial distributions. We employ the hierarchical Dirichlet process (HDP) to learn the latent patterns from these data groups. The purpose is to exploit the shared statistical strength between the groups and users. HDP uses the same set of patterns, also commonly known as topics, to explain all data groups, and assigns a mixture proportion to each group.
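The data organization described above can be illustrated with a short Python sketch. The record format `(timestamp_seconds, observer, observed)`, the function name `build_count_vectors`, and the sample sightings are hypothetical and chosen only for illustration; the 10-minute interval follows the example given in the text.

```python
from collections import defaultdict


def build_count_vectors(sightings, users, interval_seconds=600):
    """Group proximity sightings into per-user count vectors.

    sightings: iterable of (timestamp_seconds, observer, observed) tuples
               (hypothetical record format).
    users:     ordered list of user IDs; fixes the vector dimension.
    Returns {observer: {interval_index: [counts over users]}}.
    """
    index = {u: m for m, u in enumerate(users)}
    groups = defaultdict(dict)
    for t, observer, observed in sightings:
        if observer not in index or observed not in index:
            continue  # ignore devices outside the study population
        interval = int(t // interval_seconds)
        vec = groups[observer].setdefault(interval, [0] * len(users))
        vec[index[observed]] += 1
    return {u: dict(v) for u, v in groups.items()}


users = ["u1", "u2", "u3"]
sightings = [
    (30, "u1", "u2"), (90, "u1", "u2"), (200, "u1", "u3"),
    (650, "u1", "u3"), (40, "u2", "u1"),
]
vectors = build_count_vectors(sightings, users)
```

Each user's dictionary of interval vectors forms one data group, and each vector is one data point of the kind modeled with a multinomial distribution.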

Under this setting, each inferred multinomial pattern represents a typical proximal group. Users that have high values in the same pattern are usually co-located with each other. We interpret each pattern as a proximal group by applying a threshold to the multinomial vector: the users whose values are greater than the threshold form the corresponding group. Fig. 2 illustrates the proximal groups interpreted from 9 patterns learned by HDP on Reality Mining data. These proximal groups are strongly correlated with the affiliation labels, especially for the participants of the Sloan and Masfrosh groups.
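The thresholding step can be sketched as follows. The function name `pattern_to_group`, the example pattern, and the threshold value 0.2 are illustrative assumptions; the text does not state which threshold is used.

```python
def pattern_to_group(pattern, users, threshold):
    """Return the users whose value in the pattern exceeds the threshold."""
    if len(pattern) != len(users):
        raise ValueError("pattern and users must have the same length")
    return [u for u, p in zip(users, pattern) if p > threshold]


# Hypothetical multinomial pattern over four users (entries sum to 1).
users = ["u1", "u2", "u3", "u4"]
pattern = [0.45, 0.40, 0.10, 0.05]
group = pattern_to_group(pattern, users, threshold=0.2)
```

Here `u1` and `u2` carry most of the probability mass and therefore form the proximal group for this pattern.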

The inference for this model has been developed using Gibbs sampling and is presented in [5], [8]. Next, we propose to use the decayed MCMC method for incremental inference.

Fig. 2: Interpreted proximal groups learned by HDP on Reality Mining data. Each numeric node represents a user, labeled with the user ID, and the other nodes represent the groups. The color of the user nodes indicates the affiliation labels given in the dataset.


B. Indian Buffet Process

The main characteristic of the clustering approach (such as HDP) is the mutually exclusive use of topics in each data point, i.e., for each data point, an increase in the use of one topic makes the use of the others decrease. Thus, we propose to use the factor analysis approach as a complement for proximal context detection. We also explore the equivalence and difference between the two approaches, clustering and factor analysis, on this task.

For this, we collect all data points from all users to construct an $N \times M$ matrix $X$, where $N$ is the total number of data points and $M$ is the total number of users. We model the matrix $X = [X_{im}]$ as:

$$X_{im} \mid Z_i, Y \sim \mathrm{Poisson}\left(\sum_{k=1}^{\infty} Z_{ik} Y_{km}\right)$$

where $Y = [Y_{km}]$ is the factor matrix and $Z = [Z_{ik}]$ is the coefficient matrix. The key point of this model is that it has an infinite number of factors, specified by the Indian buffet process [3] prior on $Z$. Given the observed data, the model automatically infers the number of factors. The incremental inference is performed using the fixed-lag particle filter [4], which processes data in small batches. This inference method can extract the proximal contexts incrementally as they emerge. The factorial error obtained by the fixed-lag particle filter is close to that of Gibbs sampling, whilst execution is more than 100 times faster.
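To make the model above concrete, the following Python sketch draws a binary matrix from the Indian buffet process prior of [3] and evaluates the Poisson log-likelihood of a count matrix. It illustrates only the prior and the likelihood, not the fixed-lag particle filter of [4]. The function names, the concentration parameter `alpha`, the binary form of $Z$, and the toy matrices are assumptions made for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's multiplication method (small rates)."""
    limit, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_ibp(num_rows, alpha, rng):
    """Sample a binary matrix Z from the Indian buffet process prior.

    Row i (1-based) reuses existing factor k with probability m_k / i, where
    m_k counts the earlier rows using k, then adds Poisson(alpha / i) new factors.
    """
    rows, counts = [], []  # counts[k] = m_k
    for i in range(1, num_rows + 1):
        row = [1 if rng.random() < m / i else 0 for m in counts]
        new = sample_poisson(alpha / i, rng)
        row.extend([1] * new)
        counts = [m + z for m, z in zip(counts, row)] + [1] * new
        rows.append(row)
    width = len(counts)
    return [r + [0] * (width - len(r)) for r in rows]  # pad to equal width


def poisson_log_likelihood(X, Z, Y):
    """log p(X | Z, Y) under X_im ~ Poisson(sum_k Z_ik * Y_km)."""
    total = 0.0
    for x_row, z_row in zip(X, Z):
        for m, x in enumerate(x_row):
            rate = sum(z * y_row[m] for z, y_row in zip(z_row, Y))
            if rate == 0:
                if x > 0:
                    return float("-inf")  # a positive count is impossible at rate 0
                continue
            total += x * math.log(rate) - rate - math.lgamma(x + 1)
    return total


rng = random.Random(0)
Z = sample_ibp(5, alpha=2.0, rng=rng)  # the number of columns is not fixed in advance
toy_ll = poisson_log_likelihood([[2, 0]], [[1, 1]], [[1.0, 0.0], [1.0, 0.0]])
```

The number of columns of `Z` is determined by the draw itself, which is the sense in which the number of factors is left unspecified by the prior.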

Each extracted factor is an $M$-dimensional vector and can be interpreted as a proximal group. If we normalize each factor to sum to 1, it is equivalent to the multinomial pattern extracted by the HDP model. Therefore, we interpret the factors as proximal groups in the same manner.

IV. APPLICATIONS

The latent patterns and contexts extracted from social honest signals play an important role in a wide range of applications, a few of which we briefly discuss here.

Pervasive healthcare. Wearable devices can help monitor personal health by detecting physical status and contexts. Olguín and Pentland [6] suggest using the sociometric badge as a self-monitoring device that alerts users to early symptoms of depression, or that monitors the daily activities of the elderly. Our models can help to continuously detect physical activity contexts, or contexts from other data aspects of the sociometric badge such as speech or interaction.

Smart environment. A smart house can be built using wireless devices and wearable sensors to support independent living for elderly or disabled people. The proposed framework can also be used to revisit existing problems in location-aware applications. For example, it can be used to continuously learn three useful contexts: motion state, location, and movement patterns over time and space. It is known that location often co-occurs with certain activities or roles and aspects of users’ physical states, and also indicates users’ affordances such as interruptibility. Leveraging these hidden contexts may provide a better approach to a wide range of applications, from automated battery management to assistive systems (e.g., for the visually impaired).

Organization management. Pervasive systems including wireless environmental and wearable sensors can be deployed to construct sensible organizations as proposed in [6]. Those systems can capture the behavior of people and their interactions. Our models can help to detect the social contexts of individuals and groups, improving organizational performance. For example, the proximal groups described in Section III provide a clear understanding of which members usually interact with each other. This information can help to change the structure of the organization, creating teams that have good relationships and increasing the productivity of the whole organization.

V. CONCLUSION

In this work, we propose to use Bayesian nonparametric models as a rigorous framework for extracting rich, hidden patterns and contexts from pervasive honest signals, moving beyond basic derivations from raw signals. The key advantage of this approach lies in its ability to discover sophisticated and meaningful patterns whose number can grow with the data, a crucial feature for any pervasive computing system. We have demonstrated the use of two such models on activity data and proximal data to extract physical activity contexts and proximal contexts. The discovery of such contexts is useful in various applications such as health care monitoring or organization management.

ACKNOWLEDGEMENT

I am deeply grateful to my supervisors, Associate Professor Dinh Phung, Professor Svetha Venkatesh and Dr. Sunil Gupta, for their instruction and assistance.

REFERENCES

[1] G.D. Abowd and E.D. Mynatt. Charting past, present, and future: Research in ubiquitous computing. ACM Transactions on Computer-Human Interaction, 7(1):29–58, 2000.

[2] N. Eagle and A. Pentland. Reality mining: Sensing complex social systems. Personal and Ubiquitous Computing, 10(4):255–268, 2006.

[3] T. Griffiths and Z. Ghahramani. Infinite latent feature models and the Indian buffet process. Advances in Neural Information Processing Systems, 18:475, 2006.

[4] T.C. Nguyen, S. Gupta, S. Venkatesh, and D. Phung. Fixed-lag particle filter for continuous context discovery using Indian buffet process. In IEEE International Conference on Pervasive Computing and Communications (PerCom) (accepted), 2014.

[5] T.C. Nguyen, D. Phung, S. Gupta, and S. Venkatesh. Extraction of latent patterns and contexts from social honest signals using hierarchical Dirichlet processes. In PerCom, pages 47–55. IEEE, 2013.

[6] D.O. Olguín and A.S. Pentland. Sociometric badges: State of the art and future applications. In IEEE 11th International Symposium on Wearable Computers (ISWC), 2007.

[7] D.O. Olguín and A.S. Pentland. Social sensors for automatic data collection. In Americas Conference on Information Systems, page 171, 2008.

[8] D. Phung, T.C. Nguyen, S. Gupta, and S. Venkatesh. Learning latent activities from social signals with hierarchical Dirichlet process. In Handbook on Plan, Activity, and Intent Recognition (to appear). Elsevier, 2014.

[9] Y.W. Teh, M.I. Jordan, M.J. Beal, and D.M. Blei. Hierarchical Dirichlet processes. J. Am. Statist. Assoc., 101(476):1566–1581, 2006.

[10] Mark Weiser. The computer for the 21st century. Scientific American, 265(3):94–104, 1991.
