Measuring and Predicting Departures from Routine in Human Mobility Dirk Gorissen | @elazungu PyData London - 23 February 2014

Upload: pydata

Post on 14-Jul-2015


Measuring and Predicting

Departures from Routine

in Human Mobility

Dirk Gorissen | @elazungu

PyData London - 23 February 2014

Me?

www.rse.ac.uk

Human Mobility - Credits

University of Southampton James McInerney

Sebastian Stein

Alex Rogers

Nick Jennings

BAE Systems ATC Dave Nicholson

Reference: J. McInerney, S. Stein, A. Rogers, and N. R. Jennings (2013).

Breaking the habit: measuring and predicting departures from routine in individual human mobility. Pervasive and Mobile Computing, 9, (6), 808-822.

Submitted KDD paper

Beijing Taxi rides

Nicholas Jing Yuan (Microsoft Research)

Human Mobility

London in Motion - Jay Gordon (MIT)

Human Mobility: Inference

Functional Regions of a city

Nicholas Jing Yuan (Microsoft Research)

Human Mobility: Inference

Jay Gordon (MIT)

Human Mobility: Inference

Cuts across many fields: sociology, physics, network theory, computer science, epidemiology, …

© PNAS

© MIT

Project InMind

Project InMind announced on 12 Feb

$10m Yahoo-CMU collaboration on predicting human needs and intentions

Human Mobility

Human mobility is highly predictable

Average predictability in the next hour is 93% [Song 2010]

Distance has little or no impact

High degree of spatial and temporal regularity

Spatial: centered around a small number of base locations

Temporal: e.g., workweek / weekend

“…we find a 93% potential predictability in user mobility across the whole user base. Despite the significant differences in the travel patterns, we find a remarkable lack of variability in predictability, which is largely independent of the distance users cover on a regular basis.”

Temporal Regularity

[Herder 2012] [Song 2010]

Spatial Regularity

[Herder 2012] [Song 2010]

Breaking the Habit

However, regular patterns are not the full story

E.g., travelling to another city on a weekend break or while on sick leave

Breaks in regular patterns signal potentially interesting events

Being in an unfamiliar place at an unfamiliar time requires extra context-aware assistance

E.g., higher demand for map & recommendation apps, mobile advertising more relevant, …

Predict future departures from routine?

Applications

Optimize public transport

Insight into social behaviour

Spread of disease

(Predictive) Recommender systems

Based on user habits (e.g., Google Now, Sherpa)

Context aware advertising

Crime investigation

Urban planning

Obvious privacy & de-anonymization concerns

-> Eric Drass’ talk

Human Mobility: Inference

London riots “commute”

Modeling Mobility

Entropy measures typically used to determine regularity in fixed time slots

Well understood measures, wide applicability

Break down when considering prediction or higher level structure
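
As a rough illustration of the entropy-based approach, the sketch below computes the Shannon entropy of visited locations within each fixed time slot. The slot labels, location names, and the `slot_entropy` helper are hypothetical and not taken from the talk; low entropy indicates a regular slot, high entropy an irregular one.

```python
import math
from collections import Counter, defaultdict

def slot_entropy(visits):
    """Map each time slot to the Shannon entropy (bits) of its visited locations."""
    by_slot = defaultdict(Counter)
    for slot, location in visits:
        by_slot[slot][location] += 1
    result = {}
    for slot, counts in by_slot.items():
        total = sum(counts.values())
        # H = -sum_i p_i * log2(p_i) over the locations seen in this slot
        result[slot] = -sum((c / total) * math.log2(c / total)
                            for c in counts.values())
    return result

# Made-up visits: always at work on Monday 9am, somewhere different each Saturday 3pm.
visits = [("Mon 09", "work")] * 4 + [
    ("Sat 15", "shops"), ("Sat 15", "park"), ("Sat 15", "gym"), ("Sat 15", "friend"),
]
entropies = slot_entropy(visits)
```

A per-slot score like this summarises regularity but, as the slide notes, says nothing about what happens next or about structure across slots.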

Model based

Can consider different types of structure in mobility (i.e., sequential and temporal)

Can deal with heterogeneous data sources

Allows incorporation of domain knowledge (e.g., calendar information)

Can build extensions that deal with trust

Allows for prediction

Bayesian approach

distribution over locations

enables use as a generative model

Bayes Theorem
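
For reference, Bayes' theorem in its usual form is given below. The symbols are assumed notation rather than the slide's own: \theta stands for the model parameters and D for the observed data.

```latex
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)},
\qquad
p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta
```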

Bayesian Networks

Bottom up: Grass is wet, what is the most likely cause?

Top down: It's cloudy, what is the probability the grass is wet?
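
Both questions can be answered by brute-force enumeration on the classic cloudy / sprinkler / rain / wet-grass network. This is a minimal sketch; the conditional probability values are the usual textbook illustration and are an assumption here, not numbers from the talk.

```python
from itertools import product

# Illustrative textbook probabilities (assumed, not from the talk).
P_C = {True: 0.5, False: 0.5}          # P(cloudy)
P_S = {True: 0.1, False: 0.5}          # P(sprinkler on | cloudy)
P_R = {True: 0.8, False: 0.2}          # P(rain | cloudy)
P_W = {(True, True): 0.99, (True, False): 0.9,
       (False, True): 0.9, (False, False): 0.0}   # P(wet | sprinkler, rain)

def joint(c, s, r, w):
    """Joint probability of one full assignment, following the network structure."""
    p = P_C[c]
    p *= P_S[c] if s else 1 - P_S[c]
    p *= P_R[c] if r else 1 - P_R[c]
    p *= P_W[(s, r)] if w else 1 - P_W[(s, r)]
    return p

def prob(query, evidence):
    """P(query | evidence), summing the joint over all 16 assignments."""
    names = ("c", "s", "r", "w")
    num = den = 0.0
    for values in product([True, False], repeat=4):
        world = dict(zip(names, values))
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(**world)
            den += p
            if all(world[k] == v for k, v in query.items()):
                num += p
    return num / den

p_wet_given_cloudy = prob({"w": True}, {"c": True})     # top down
p_rain_given_wet = prob({"r": True}, {"w": True})       # bottom up
p_sprinkler_given_wet = prob({"s": True}, {"w": True})  # bottom up
```

With these assumed numbers, rain comes out as a more likely cause of wet grass than the sprinkler.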

Hidden Markov Model

Simple Dynamic Bayesian Network

Shaded nodes are observed
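
To make the hidden/observed distinction concrete, here is a small sketch of the standard forward (filtering) recursion for a discrete HMM using numpy. The two hidden states (home, work), the two observed cell towers, and all probabilities are invented for illustration.

```python
import numpy as np

def forward_filter(pi, A, B, obs):
    """Return P(hidden state at step n | observations up to step n) for each n."""
    alpha = pi * B[:, obs[0]]          # prior times likelihood of first observation
    alpha = alpha / alpha.sum()
    filtered = [alpha]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # predict one step, then weight by likelihood
        alpha = alpha / alpha.sum()
        filtered.append(alpha)
    return np.array(filtered)

pi = np.array([0.5, 0.5])              # initial belief over (home, work)
A = np.array([[0.8, 0.2],              # transition probabilities between hidden states
              [0.3, 0.7]])
B = np.array([[0.9, 0.1],              # rows: hidden state, columns: observed tower
              [0.2, 0.8]])
obs = [0, 0, 1]                        # observed tower index at each time step
beliefs = forward_filter(pi, A, B, obs)
```

Each row of `beliefs` is a distribution over the hidden state; after the third observation the belief shifts from home to work.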

Probabilistic Models

Model can be run forwards or backwards

Forwards (generation): parameters -> data

E.g., use a distribution over word pair frequencies to generate sentences
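
The word-pair example can be sketched in a few lines: given a table of word pair counts, repeatedly sample the next word in proportion to how often it followed the current one. The tiny count table and the `<s>` / `</s>` boundary markers are made up for illustration.

```python
import random

# Hypothetical word pair counts: bigram_counts[w1][w2] = times w2 followed w1.
bigram_counts = {
    "<s>": {"i": 3, "we": 1},
    "i": {"go": 2, "stay": 1},
    "we": {"go": 1},
    "go": {"home": 3, "out": 1},
    "stay": {"home": 1},
    "home": {"</s>": 1},
    "out": {"</s>": 1},
}

def generate(counts, rng, max_len=20):
    """Run the model forwards: parameters (counts) -> data (a sentence)."""
    word, sentence = "<s>", []
    for _ in range(max_len):
        followers = counts[word]
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == "</s>":
            break
        sentence.append(word)
    return sentence

sentence = generate(bigram_counts, random.Random(0))
```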

Probabilistic Models

Model can be run backwards

Backwards (Inference): data -> parameters
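
Running a model backwards can be illustrated with a very simple case: estimating location-to-location transition probabilities from an observed sequence. The sketch assumes a symmetric Dirichlet prior on each row (the `prior` pseudo-count) and returns the posterior mean; the location coding and the sequence are invented, and this is far simpler than the model described in the talk.

```python
import numpy as np

def posterior_transitions(sequence, n_states, prior=1.0):
    """Posterior mean of each transition row under a symmetric Dirichlet prior."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(sequence[:-1], sequence[1:]):
        counts[a, b] += 1              # observed transition a -> b
    posterior = counts + prior         # Dirichlet pseudo-counts plus data counts
    return posterior / posterior.sum(axis=1, keepdims=True)

# Hypothetical coding: 0 = home, 1 = work, 2 = gym.
seq = [0, 1, 0, 1, 0, 2, 0, 1, 0]
T = posterior_transitions(seq, n_states=3)
```

Here `T[0, 1]` is the estimated probability of going from home to work; the prior keeps unseen transitions from getting exactly zero probability.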

Building the model

We want to model departures from routine

Assume assignment of a person to a hidden location at all time steps (even when not observed)

Discrete latent locations

Correspond to “points of interest”

e.g., home, work, gym, train station, friend's house

Latent Locations

Augment with temporal structure

Temporal and periodic assumption to behaviour

e.g., tend to be home each night at 1am

e.g., often in shopping district on Sat afternoon

Add Sequential Structure

Added first-order Markov dynamics

e.g., usually go home after work

can extend to more complex sequential structures

Add Departure from Routine

zn = 0 : routine

zn = 1 : departure from routine
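
The role of the indicator zn can be sketched with a deliberately simplified two-component model: under routine (zn = 0) the location is drawn from a time-dependent habit distribution, and under departure (zn = 1) it is drawn uniformly over locations. The uniform departure distribution, the prior `p_depart`, and the `routine` table are all assumptions for illustration; the sketch also ignores the Markov dynamics and sensor noise of the full model in the paper.

```python
import numpy as np

def departure_posterior(routine, loc, hour, p_depart=0.1):
    """P(zn = 1 | location, time slot) under the simplified two-component model."""
    n_locations = routine.shape[1]
    like_routine = routine[hour, loc]      # P(location | zn = 0, time slot)
    like_depart = 1.0 / n_locations        # P(location | zn = 1), assumed uniform
    num = p_depart * like_depart
    return num / (num + (1 - p_depart) * like_routine)

# Hypothetical habits. Rows: time slot (night, day); columns: home, work, other city.
routine = np.array([[0.97, 0.02, 0.01],
                    [0.10, 0.88, 0.02]])
p_home_night = departure_posterior(routine, loc=0, hour=0)
p_away_night = departure_posterior(routine, loc=2, hour=0)
```

Being at home at night gives a low departure probability, while being in another city at night gives a high one, which matches the intuition of an unfamiliar place at an unfamiliar time.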

Sensors

Noisy sensors, e.g., cell tower observations

observed: latitude/longitude

inferred: variance (of locations)

Reported Variance

E.g., GPS

observed: latitude/longitude, variance

Trustworthiness

E.g., Eyewitness

observed: latitude/longitude, reported variance

inferred: trustworthiness of observation

single latent trust value (per time step & source)
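
One common way to combine noisy sources is precision weighting, sketched below. Scaling each source's precision (1 / variance) by a trust value in (0, 1] is an assumption made for this example; the slides do not state how trust enters the model. The coordinates, variances, and trust values are invented.

```python
import numpy as np

def fuse(positions, variances, trust):
    """Trust-scaled, precision-weighted mean of several position reports."""
    positions = np.asarray(positions, dtype=float)
    # Assumed trust model: an untrusted source counts as if it were noisier.
    precision = np.asarray(trust, dtype=float) / np.asarray(variances, dtype=float)
    weights = precision / precision.sum()
    fused_mean = weights @ positions       # weighted average of (lat, lon) rows
    fused_variance = 1.0 / precision.sum()
    return fused_mean, fused_variance

positions = [(51.50, -0.12),   # GPS fix
             (51.60, -0.10)]   # eyewitness report
variances = [1e-6, 1e-6]       # reported variances
trust = [1.0, 0.1]             # hypothetical trust values per source
fused_mean, fused_variance = fuse(positions, variances, trust)
```

With equal reported variances, the low-trust eyewitness report pulls the fused estimate only slightly away from the GPS fix.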

Full Model

Inference

Inference is Challenging

Exact inference intractable

Can perform approximate inference using:

Expectation maximisation algorithm

Fast

But point estimates of parameters

Gibbs sampling, or other Markov chain Monte Carlo

Full distributions (converges to exact)

But slow

Variational approximation

Full distributions based on induced factorisation of model

And fast

Variational Approximation

Advantages

Straightforward parallelisation by user

Months of mobility data ~ hours

Updating previous day's parameters ~ minutes

Variational approximation amenable to fully online inference

M. Hoffman, D. Blei, C. Wang, and J. Paisley.

Stochastic variational inference.

arXiv:1206.7051, 2012

Model enables

Inference

location

departures from routine

noise characteristics of observations

trust characteristics of sensors

Exploration/summarisation

parameters have intuitive interpretations

Prediction

Future mobility (given time context)

Future departures from routine

Performance

Nokia Dataset (GPS only) [McInerney 2012]

Performance

Performance

Synthetic dataset with heterogeneous, untrustworthy observations.

Parameters of generating model learned from OpenPaths dataset

Performance

Implementation

Backend inference and data processing code all in Python

numpy

scipy

matplotlib

UI to explore model predictions & sanity check

flask

d3.js

leaflet.js

knockout.js

Future

Gensim, pymc, bayespy, …

Probabilistic programming

Map View: Observed

Map View: Inferred

Departures from Routine: Temporal

Departures from Routine: Spatial

Departures from Routine: Combined

Departures from Routine

Conclusion & Future Work

Summary

Novel model for learning and predicting departures from routine

Limitations

Need better ground truth for validation

Finding ways to make the model explain why each departure from routine happened.

Needs more data (e.g., from people who know each other, using weather data, app usage data, …).

Future Work

Incorporating more advanced sequential structure into the model

e.g., hidden semi-Markov model, sequence memoizer

Supervised learning of what “interesting” mobility looks like

More data sources

Online inference

Taxi drivers

Questions?

Thank you.

@elazungu
