Privacy of Location Trajectory

Chi-Yin Chow, Department of Computer Science, City University of Hong Kong
Mohamed F. Mokbel, Department of Computer Science and Engineering, University of Minnesota


Post on 03-Jan-2016


Page 1: Privacy of Location Trajectory

Privacy of Location Trajectory

Chi-Yin Chow
Department of Computer Science
City University of Hong Kong

Mohamed F. Mokbel
Department of Computer Science and Engineering
University of Minnesota

Page 2: Privacy of Location Trajectory

Outline

• Introduction

• Protecting Trajectory Privacy in Location-based Services

• Protecting Privacy in Trajectory Publication

• Future Research Directions

Page 3: Privacy of Location Trajectory

Data Privacy

• Example: Hospitals want to publish medical records for public health research
  • Contain personal sensitive information
  • Natural way: remove known identifiers (de-identify)

[Figure: de-identified medical records with the attributes Gender, Zip Code, Date of Birth, Diagnosis, ...]

Page 4: Privacy of Location Trajectory

Is De-identification Enough?

[Figure: two tables side by side. Voter Registration Records: Gender, Zip Code, Date of Birth, Name, ... Medical Records: Gender, Zip Code, Date of Birth, Diagnosis, ...]

Page 5: Privacy of Location Trajectory

Is De-identification Enough?

[Figure: Voter Registration Records (Name, ...) and Medical Records (Diagnosis, ...) overlap on Gender, Zip Code, and Date of Birth]

• Quasi-identifiers: Gender, Zip Code, Date of Birth
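The linkage between the two tables can be illustrated with a minimal sketch. All records below are invented for illustration, and the helper name `link` is hypothetical; the sketch only assumes that both tables share the three quasi-identifiers named on the slide.

```python
# Hypothetical toy data: every name, zip code, and diagnosis is invented.
voter_records = [
    {"name": "Alice", "gender": "F", "zip": "55455", "dob": "1980-02-14"},
    {"name": "Bob", "gender": "M", "zip": "55455", "dob": "1975-07-01"},
    {"name": "Carol", "gender": "F", "zip": "55414", "dob": "1980-02-14"},
]
medical_records = [  # "de-identified": the name column has been removed
    {"gender": "F", "zip": "55455", "dob": "1980-02-14", "diagnosis": "flu"},
    {"gender": "M", "zip": "55455", "dob": "1975-07-01", "diagnosis": "asthma"},
]
QUASI_IDENTIFIERS = ("gender", "zip", "dob")


def link(voters, medical, keys=QUASI_IDENTIFIERS):
    """Join both tables on the quasi-identifiers.

    Returns (name, diagnosis) pairs for every medical record whose
    quasi-identifier values match exactly one voter.
    """
    index = {}
    for voter in voters:
        index.setdefault(tuple(voter[k] for k in keys), []).append(voter["name"])
    linked = []
    for record in medical:
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the patient
            linked.append((names[0], record["diagnosis"]))
    return linked


print(link(voter_records, medical_records))
```

Removing the name column alone does not help here, because each combination of quasi-identifier values is unique in the voter table.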

Page 6: Privacy of Location Trajectory

Data Privacy-Preserving Techniques

• k-anonymity (Sweeney, IJUFKS’02)
  • Each record is indistinguishable from at least k-1 other records with respect to the quasi-identifiers

• l-diversity (Machanavajjhala et al., TKDD’07)
  • Each equivalence class has at least l values for the sensitive attribute

• t-closeness (Li et al., TKDE’10)
  • The distribution of the sensitive attribute in each equivalence class is within distance t of its distribution in the entire data set
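As a rough sketch of the first two definitions, the check below groups records into equivalence classes by their quasi-identifier values. The function names and the toy table are assumptions for illustration, and the l-diversity check uses the simplest "distinct values" reading of the definition.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for record in records:
        classes[tuple(record[q] for q in quasi_identifiers)].append(record)
    return classes


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class holds at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len({r[sensitive] for r in group}) >= l for group in classes.values())


# Hypothetical generalized table (zip codes already coarsened).
table = [
    {"gender": "F", "zip": "554**", "diagnosis": "flu"},
    {"gender": "F", "zip": "554**", "diagnosis": "asthma"},
    {"gender": "M", "zip": "554**", "diagnosis": "flu"},
    {"gender": "M", "zip": "554**", "diagnosis": "flu"},
]
print(is_k_anonymous(table, ("gender", "zip"), k=2))
print(is_l_diverse(table, ("gender", "zip"), "diagnosis", l=2))
```

The toy table is 2-anonymous but not 2-diverse: both records in the male class share one diagnosis, so membership in that class already reveals the sensitive value.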

Page 7: Privacy of Location Trajectory

Location Privacy

• Location-Based Services (LBS)
• Untrusted LBS service provider: location privacy leakage

Page 8: Privacy of Location Trajectory

Location Privacy-Preserving Techniques

• False Location
  • Users generate fake locations

• Space Transformation
  • Transform locations into another space

• Spatial Cloaking
  • Blur the user’s location into a cloaked region

Page 9: Privacy of Location Trajectory

More Challenging: Trajectory Privacy

• The hospital example
  • Suppose the trajectories of patients are to be published

• Trajectory T:
  • De-identified

[Table: de-identified trajectories T, each with a sensitive attribute]

Suppose an adversary knows that a patient visited (1, 5) and (8, 10) at timestamps 2 and 5, respectively.

The adversary can then learn that this patient has HIV. Locations are powerful quasi-identifiers!
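The attack above amounts to filtering the published table by the known visits. The sketch below uses an invented trajectory table (the slide's original table is not available in this transcript), and the function name `matching` is hypothetical; only the two known visits come from the slide.

```python
# Hypothetical de-identified table: timestamp -> (x, y), plus a sensitive attribute.
trajectories = [
    {"id": "T1", "points": {1: (0, 4), 2: (1, 5), 3: (3, 7), 5: (8, 10)}, "disease": "HIV"},
    {"id": "T2", "points": {1: (2, 2), 2: (1, 5), 4: (6, 6), 5: (9, 9)}, "disease": "flu"},
    {"id": "T3", "points": {2: (4, 4), 3: (5, 6), 5: (8, 10)}, "disease": "fever"},
]

# Background knowledge from the slide: (1, 5) at timestamp 2 and (8, 10) at timestamp 5.
known_visits = {2: (1, 5), 5: (8, 10)}


def matching(trajs, visits):
    """Return the trajectories consistent with every known (timestamp, location) pair."""
    return [
        t for t in trajs
        if all(t["points"].get(ts) == loc for ts, loc in visits.items())
    ]


candidates = matching(trajectories, known_visits)
print([(t["id"], t["disease"]) for t in candidates])
```

When exactly one trajectory survives the filter, its sensitive attribute is disclosed even though the table carries no names.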

Page 10: Privacy of Location Trajectory

Two Kinds of Trajectory

• Real-time Trajectory -- Continuous LBS
  • “Continuously inform me of the traffic conditions within 1 mile of my vehicle”
  • “Let me know my friends’ locations if they are within 2 km of my location”

• Off-line Trajectory -- Historical Trajectory
  • Publish trajectory data for public research
  • Answer spatio-temporal range queries

Page 11: Privacy of Location Trajectory

Continuous Location-based Services vs. Trajectory Publication

• Scalability Requirement
  • Continuous LBS: Real-time
  • Historical Trajectory: Off-line

• Applicability of Global Optimization
  • Continuous LBS: Dynamic, Uncertain
  • Historical Trajectory: Static


Page 13: Privacy of Location Trajectory

Protecting Trajectory Privacy in LBS

• Category-I LBS: Require consistent user identities.
  • “Let me know my friends’ locations if they are within 2 km of my location”

• Category-II LBS: Do not require consistent user identities.
  • “Send e-coupons to users within 1 km of my coffee shop”

Page 14: Privacy of Location Trajectory

Protecting Trajectory Privacy in LBS

• Spatial cloaking
• Mix-zones
• Vehicular mix-zones
• Path confusion
• Path confusion with mobility prediction and data caching
• Euler histogram-based on short IDs
• Dummy trajectories

Page 15: Privacy of Location Trajectory

Spatial Cloaking

• Main Idea: Blur the user’s location into a cloaked region
  • k-anonymity

• Challenge: From snapshot location to continuous trajectory
  • Trajectory tracing attack
  • Anonymity-set tracing attack

• Supports consistent user identities
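A naive snapshot cloaking step can be sketched as follows. This is only one simple way to build a k-anonymous region (the bounding box of the user and the k-1 nearest other users); the function name `cloak` and the coordinates are assumptions, not the method of any particular paper.

```python
import math


def cloak(user, positions, k):
    """Return a rectangular cloaked region and its anonymity set.

    positions maps a user name to an (x, y) location. The region is the
    bounding box (xmin, ymin, xmax, ymax) of the user and the k-1 nearest
    other users, so it covers at least k users.
    """
    here = positions[user]
    others = sorted(
        (name for name in positions if name != user),
        key=lambda name: math.dist(positions[name], here),
    )
    if len(others) < k - 1:
        raise ValueError("fewer than k users available")
    members = [user] + others[: k - 1]
    xs = [positions[name][0] for name in members]
    ys = [positions[name][1] for name in members]
    return (min(xs), min(ys), max(xs), max(ys)), set(members)


# Hypothetical snapshot of four users.
snapshot = {"A": (1, 1), "B": (2, 3), "C": (3, 2), "D": (8, 8)}
region, anonymity_set = cloak("A", snapshot, k=3)
print(region, sorted(anonymity_set))
```

Applying this independently at every timestamp is what opens the door to the two tracing attacks listed above, since consecutive regions can be correlated.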

Page 16: Privacy of Location Trajectory

Trajectory Tracing Attack (1/2)

Suppose R1 and R2 are two cloaked regions for user U at t1 and t2, respectively.

Suppose the attacker knows U’s maximum speed.

[Figure: users A, B, and C plotted over x, y, and time; cloaked region R1 at t1 and R2 at t2; the second panel adds the maximum bound of R1 at t2]

Page 17: Privacy of Location Trajectory

Trajectory Tracing Attack (2/2)

The attacker could then infer which user is U (here, it is C).

[Figure: users A, B, and C over x, y, and time; cloaked regions R1 at t1 and R2 at t2, with the maximum bound overlaid on R2]
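The attack can be sketched with axis-aligned rectangles. The sketch makes two simplifying assumptions: the maximum bound is approximated by growing R1 by `v_max * dt` on every side (a conservative rectangle; the exact bound has rounded corners), and the attacker can observe where users are at t2. All names and coordinates are hypothetical.

```python
def expand(region, v_max, dt):
    """Grow a rectangle (xmin, ymin, xmax, ymax) by the farthest distance reachable in dt."""
    xmin, ymin, xmax, ymax = region
    reach = v_max * dt
    return (xmin - reach, ymin - reach, xmax + reach, ymax + reach)


def intersect(a, b):
    """Intersection of two rectangles, or None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    return (xmin, ymin, xmax, ymax) if xmin <= xmax and ymin <= ymax else None


def contains(region, point):
    return region[0] <= point[0] <= region[2] and region[1] <= point[1] <= region[3]


def tracing_attack(r1, r2, v_max, dt, users_at_t2):
    """Users in R2 who could have been inside R1 at the earlier timestamp."""
    feasible = intersect(expand(r1, v_max, dt), r2)
    if feasible is None:
        return []
    return sorted(name for name, p in users_at_t2.items() if contains(feasible, p))


# Hypothetical scenario: only C lies in the overlap of R2 and the maximum bound.
R1, R2 = (0, 0, 2, 2), (3, 0, 9, 4)
users = {"A": (8, 1), "B": (6, 3), "C": (3.5, 2)}
print(tracing_attack(R1, R2, v_max=2, dt=1, users_at_t2=users))
```

If the returned list has a single entry, R2 no longer provides any anonymity for U, regardless of how many users it contains.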

Page 18: Privacy of Location Trajectory

Trajectory Tracing Attack: Solution

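[Figure: two panels showing users A, B, and C, cloaked regions R1 and R2, and the maximum bound over x, y, and time. Left panel (Patching Technique): t1 and t2. Right panel (Delaying Technique): t1, t2, and a later time tn]

• Patching Technique: combine the current cloaked region with the previous one
• Delaying Technique: postpone the update until the maximum bound covers the current cloaked region

(Cheng et al., PETS’06)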

Page 19: Privacy of Location Trajectory

Anonymity-set Tracing Attack

[Figure: users A to H in the x-y plane at time t1 and at time t2, each panel showing a 3-anonymous cloaked spatial region]
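One common way to describe this attack is as an intersection of anonymity sets: the querying user must appear in every cloaked region reported under the same identity, while the other users come and go. The sketch below assumes the attacker can see which users fall in each region; the coordinates and function names are invented for illustration.

```python
def contains(region, point):
    """True if point lies inside the rectangle (xmin, ymin, xmax, ymax)."""
    return region[0] <= point[0] <= region[2] and region[1] <= point[1] <= region[3]


def anonymity_set_tracing(regions, snapshots):
    """Intersect the sets of users covered by consecutive cloaked regions.

    regions[i] is the cloaked region at time i; snapshots[i] maps each
    user name to its (x, y) location at that time.
    """
    candidates = None
    for region, positions in zip(regions, snapshots):
        inside = {name for name, p in positions.items() if contains(region, p)}
        candidates = inside if candidates is None else candidates & inside
    return candidates


# Hypothetical data: each region covers three users, but only A is in both.
regions = [(0, 0, 3, 4), (4, 4, 8, 7)]
snapshots = [
    {"A": (1, 1), "B": (2, 2), "C": (1, 3), "D": (6, 6), "E": (7, 5)},
    {"A": (5, 5), "B": (1, 2), "C": (2, 4), "D": (6, 6), "E": (7, 5)},
]
print(anonymity_set_tracing(regions, snapshots))
```

Each region is 3-anonymous on its own, yet the intersection shrinks to a single user after two timestamps.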

Page 20: Privacy of Location Trajectory

Anonymity-set Tracing Attack: Solution

• Solution 1: Group-based Approach

• Solution 2: Distortion-based Approach

• Solution 3: Prediction-based Approach


Page 21: Privacy of Location Trajectory

Solution 1: Group-based Approach

[Figure: users A to H in the x-y plane at time t1, time t2, and time t3, with the 3-anonymous cloaked spatial region drawn in each panel]

• Group members are fixed
• All members need to report their locations to the anonymizer server periodically

(Chow et al., SSTD’07)

Page 22: Privacy of Location Trajectory

Solution 2: Distortion-based Approach

• Other members do not need to report their locations periodically
• Use their initial directions and velocities to calculate distortion regions
• Use the distortion regions as the new cloaked regions

[Figure: at time t1, cloaked region R1 with corners (x1-, y1-) and (x1+, y1+) covering users A, B, and C; at time ti, regions R1, R2, ..., Rn-1, Rn over times t1, t2, ..., tn-1, tn]

(Pan et al., SIGSPATIAL’09)

Page 23: Privacy of Location Trajectory

Solution 3: Prediction-based Approach

• Predict the user’s trajectory
• Cloak it with other users’ historical trajectories

[Figure: historical trajectories of users u1, u2, and u3; the expected trajectory with points p1 to p5; cloaked regions C1 to C5]

(Xu et al., INFOCOM’08)


Page 25: Privacy of Location Trajectory

Mix-Zones (1/2)

• Main Idea:
  • Users change pseudonyms when entering mix-zones
  • Users do not reveal their locations while they are in mix-zones
  • k-anonymity

• Does not support consistent user identities

Page 26: Privacy of Location Trajectory

Mix-Zones (2/2)

• Ensuring k-anonymity
  • At least k users are in the mix-zone at a certain point in time
  • Each user spends a completely random duration of time in the mix-zone
  • Each user is equally likely to exit through any exit point, regardless of the entry point

[Figure: a mix-zone with labels a, b, c and x, y, z]

(Freudiger et al., PETS’09)
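The first condition and the pseudonym change can be sketched as below. This is a minimal illustration under assumed names (`users_in_zone`, `has_k_users`, `change_pseudonyms`) and a rectangular zone; it does not model the random-duration or uniform-exit conditions, which depend on user movement.

```python
import secrets


def inside(zone, point):
    xmin, ymin, xmax, ymax = zone
    return xmin <= point[0] <= xmax and ymin <= point[1] <= ymax


def users_in_zone(zone, positions):
    """Names of the users currently located inside the mix-zone."""
    return {name for name, p in positions.items() if inside(zone, p)}


def has_k_users(zone, positions, k):
    """First k-anonymity condition: at least k users are in the zone right now."""
    return len(users_in_zone(zone, positions)) >= k


def change_pseudonyms(pseudonyms, entering):
    """Return a new mapping where every entering user gets a fresh random pseudonym."""
    updated = dict(pseudonyms)
    for user in entering:
        fresh = secrets.token_hex(8)
        while fresh in updated.values():  # avoid reusing any pseudonym in use
            fresh = secrets.token_hex(8)
        updated[user] = fresh
    return updated


# Hypothetical snapshot: a, b, and c are inside the zone, d is outside.
zone = (0, 0, 10, 10)
positions = {"a": (1, 1), "b": (5, 5), "c": (9, 2), "d": (20, 20)}
pseudonyms = {"a": "p1", "b": "p2", "c": "p3", "d": "p4"}
if has_k_users(zone, positions, k=3):
    pseudonyms = change_pseudonyms(pseudonyms, users_in_zone(zone, positions))
print(pseudonyms["d"])
```

Because no locations are reported inside the zone, an observer only sees old pseudonyms disappear and new ones appear, and cannot directly tell which is which.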

Page 27: Privacy of Location Trajectory

Vehicular Mix-Zones (1/2)

• Mix-zones designed for Euclidean space are not secure enough for vehicle movements, which are constrained by:
  • Physical roads
  • Vehicle directions
  • Speed limits
  • Traffic conditions
  • Road conditions

[Figure: a mix-zone at a road intersection with road segments Seg1in, Seg1out, Seg2in, Seg2out, Seg3in, and Seg3out, and vehicles a, b, c, and d]

Page 28: Privacy of Location Trajectory

Vehicular Mix-Zones (2/2)

• Adaptive mix-zones:
  • Road intersection, together with outgoing road segments

[Figure: a road intersection with road segments Seg1in, Seg1out, Seg2in, Seg2out, Seg3in, and Seg3out, and vehicles a, b, c, and d]

(Palanisamy et al., ICDE’11)


Page 30: Privacy of Location Trajectory

Path Confusion

• Goal: Avoid linking consecutive location samples to individual vehicles

• Main Idea: A central server controls the release of location data to satisfy “time-to-confusion”

• Does not support consistent user identities

(Gruteser et al., MobiSys’03)

Page 31: Privacy of Location Trajectory

Path Confusion with Mobility Prediction and Data Caching

• Main Idea: The location anonymizer predicts vehicular movement paths, pre-fetches the spatial data on the predicted paths, and stores the data in a cache
  • The service provider can only see queries for a series of interweaving paths

[Figure: user U on a road network with nodes a to f; the predicted path is marked, and the data on the predicted paths are cached]

(Meyerowitz et al., MobiCom’09)


Page 33: Privacy of Location Trajectory

Euler Histogram-based on Short IDs (EHSID)

• Goal: Privacy-aware traffic monitoring (answering aggregate queries for a given region)
  • ID-based query (count of unique vehicles) (need ID?)
  • Entry-based query (count of entries)

• Short ID: Partial ID information about objects
  • Full ID: 1 1 0 1 1 1 0 1 1
  • Bit Pattern: 1, 3, 4, 7
  • Short ID: 1 0 1 0

• Euler Histogram: Answers aggregate queries

• Does not support consistent user identities

(Xie et al., IEEE Trans. ITS’10)
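The short-ID example above can be reproduced in a few lines, assuming the bit pattern lists 1-based positions into the full ID (this reading matches the slide's numbers). The function name `short_id` is hypothetical.

```python
def short_id(full_id, bit_pattern):
    """Pick the bits of full_id at the 1-based positions listed in bit_pattern."""
    return [full_id[position - 1] for position in bit_pattern]


full_id = [1, 1, 0, 1, 1, 1, 0, 1, 1]
print(short_id(full_id, [1, 3, 4, 7]))  # [1, 0, 1, 0]
```

Many different full IDs map to the same short ID, so a short ID alone carries only partial ID information about an object.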

Page 34: Privacy of Location Trajectory

Euler Histogram

Use an Euler histogram to count the distinct rectangles in a query region R

• F is the sum of face counts inside R
• V is the sum of vertex counts inside R (excluding its boundary)
• E is the sum of edge counts inside R (excluding its boundary)

[Figure: rectangles A, B, and C on a grid whose faces, edges, and vertices are annotated with counts; the query region is highlighted]

For the query region:
F = 1 + 2 + 1 + 2 = 6
E = 1 + 1 + 1 + 2 = 5
V = 1
Number of distinct rectangles = F + V - E = 6 + 1 - 5 = 2
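A small grid-based sketch can reproduce these numbers. It assumes rectangles and the query region are aligned to grid cells and given as inclusive cell ranges (x0, y0, x1, y1); the two rectangles below are invented so that the counts match the slide's arithmetic, and all function names are hypothetical.

```python
from collections import Counter


def interior_elements(rect):
    """Faces, edges, and vertices lying strictly inside a cell-aligned rectangle."""
    x0, y0, x1, y1 = rect
    faces = [("f", i, j) for i in range(x0, x1 + 1) for j in range(y0, y1 + 1)]
    # ("ev", i, j): vertical edge between cells (i-1, j) and (i, j)
    edges = [("ev", i, j) for i in range(x0 + 1, x1 + 1) for j in range(y0, y1 + 1)]
    # ("eh", i, j): horizontal edge between cells (i, j-1) and (i, j)
    edges += [("eh", i, j) for i in range(x0, x1 + 1) for j in range(y0 + 1, y1 + 1)]
    vertices = [("v", i, j) for i in range(x0 + 1, x1 + 1) for j in range(y0 + 1, y1 + 1)]
    return faces, edges, vertices


def build_histogram(rectangles):
    """Count, for every face, edge, and vertex, the rectangles whose interior covers it."""
    histogram = Counter()
    for rect in rectangles:
        for group in interior_elements(rect):
            histogram.update(group)
    return histogram


def count_distinct(histogram, query):
    """Return (F, V, E, F + V - E) for the query region."""
    faces, edges, vertices = interior_elements(query)
    f = sum(histogram[key] for key in faces)
    e = sum(histogram[key] for key in edges)
    v = sum(histogram[key] for key in vertices)
    return f, v, e, f + v - e


# Hypothetical layout: one rectangle covers the 2x2 query region, one overlaps its right column.
hist = build_histogram([(0, 0, 1, 1), (1, 0, 2, 1)])
print(count_distinct(hist, (0, 0, 1, 1)))  # (6, 1, 5, 2)
```

The histogram stores only counts, never object identities, which is why it suits privacy-aware aggregate queries.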

Page 35: Privacy of Location Trajectory

Euler Histogram-based on Short IDs (EHSID)

• Answering four types of queries
  • ID-based cross-border
  • ID-based distinct-objects
  • Entry-based cross-border
  • Entry-based distinct-objects

• How to calculate these answers using Euler Histogram?

[Figure: a query region with labels V1 and V2]

Query answers by query type:

Query type     Cross-border    Distinct-object
ID-based       1               2
Entry-based    2               3

Page 36: Privacy of Location Trajectory

Define Four Types of Vertices

[Figure: a query region Q over a road segment with labels a to f and two trajectories; vertices (V) and edges (E) carry per-short-ID counts such as "01: 1" and "01: 1, 10: 1"; the four vertex types are labeled JO, OB, JI, and CI]

Page 37: Privacy of Location Trajectory

Euler Histogram-based on Short IDs (EHSID)

[Figure: the query region Q, road segment, two trajectories, and short-ID counts on vertices (V) and edges (E), with the vertex type labels JO, OB, JI, and CI]


Page 39: Privacy of Location Trajectory

Dummy Trajectories

• Main Idea: Users generate fake location trajectories
  • How to choose dummy trajectories?
  • How to measure the degree of privacy protection?

• Supports consistent user identities

(You et al., PALMS’07)

Page 40: Privacy of Location Trajectory

How to Choose Dummy Trajectories

• Snapshot disclosure (SD): Average probability of successfully inferring each true location
• Trajectory disclosure (TD): Probability of successfully identifying the true trajectory among all possible trajectories
• Distance deviation (DD): Average distance between the i-th location samples of the real trajectory and each dummy trajectory

[Figure: a real trajectory Tr and two dummy trajectories Td1 and Td2 on an x-y grid (x from 0 to 5, y from 0 to 4), with location samples s1 to s3, distances d1 to d3, and points I1 and I2]
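Two of the three measures can be sketched directly from their descriptions. The sketch assumes SD is the average, over timestamps, of one over the number of distinct locations reported at that timestamp, and that all trajectories are sampled at the same timestamps; TD is omitted because it requires enumerating the possible trajectories through intersections. The data and function names are invented.

```python
import math


def snapshot_disclosure(real, dummies):
    """Average probability of guessing the true location at each timestamp."""
    total = 0.0
    for i, true_point in enumerate(real):
        distinct = {true_point} | {dummy[i] for dummy in dummies}
        total += 1 / len(distinct)
    return total / len(real)


def distance_deviation(real, dummies):
    """Average distance between the i-th real sample and the i-th dummy samples."""
    total = 0.0
    for i, true_point in enumerate(real):
        total += sum(math.dist(true_point, dummy[i]) for dummy in dummies) / len(dummies)
    return total / len(real)


# Hypothetical trajectories: the first dummy crosses the real trajectory at (2, 2).
real = [(1, 1), (2, 2), (3, 3)]
dummies = [
    [(1, 3), (2, 2), (3, 1)],
    [(1, 2), (2, 3), (3, 4)],
]
print(round(snapshot_disclosure(real, dummies), 4))
print(round(distance_deviation(real, dummies), 4))
```

A shared location lowers the number of distinct candidates at that timestamp and so raises SD, while the same crossing creates additional possible trajectories, which is what TD is meant to capture.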


Page 42: Privacy of Location Trajectory

Protecting Privacy in Trajectory Publication

• Clustering-based Anonymization Approach

• Generalization-based Anonymization Approach

• Suppression-based Anonymization Approach

• Grid-based Anonymization Approach

Page 43: Privacy of Location Trajectory

Clustering-based Anonymization Approach

• Main Idea: Group k co-localized trajectories within the same time period to form a k-anonymized aggregate trajectory.
• Trajectory Uncertainty Model

[Figure: a trajectory in x-y-time space with the labels Trajectory, Trajectory Volume, Uncertainty threshold d, and Horizontal Disk]

(Abul et al., ICDE’08)

Page 44: Privacy of Location Trajectory

Clustering-based Anonymization Approach

Aggregate trajectory of a set of 2-anonymized co-localized trajectories

[Figure: in x-y-time space, the trajectory volume of Tp (radius = d), the trajectory volume of Tq (radius = d), the bounding trajectory volume of Tp and Tq (radius = d/2), and the aggregate trajectory]
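A minimal sketch of the 2-anonymized case, assuming two trajectories are co-localized when they are sampled at the same timestamps and stay within distance d of each other at every sample, and that the aggregate trajectory is their pointwise average. Both function names and the sample data are hypothetical.

```python
import math


def co_localized(tp, tq, d):
    """True if the two trajectories stay within distance d at every timestamp."""
    return len(tp) == len(tq) and all(math.dist(p, q) <= d for p, q in zip(tp, tq))


def aggregate(trajectories):
    """Pointwise average of trajectories sampled at the same timestamps."""
    n = len(trajectories)
    return [
        tuple(sum(coords) / n for coords in zip(*points))
        for points in zip(*trajectories)
    ]


# Hypothetical pair Tp, Tq that always stays exactly 1 unit apart.
tp = [(0, 0), (1, 1), (2, 2)]
tq = [(0, 1), (1, 2), (2, 3)]
print(co_localized(tp, tq, d=1))
print(aggregate([tp, tq]))
```

For two trajectories that are at most d apart, every sample lies within d/2 of the pointwise average, which is consistent with the bounding volume of radius d/2 in the figure.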


Page 46: Privacy of Location Trajectory

Generalization-based Anonymization Approach

• Main Idea:
  • Step 1: Generalize a trajectory data set into a sequence of k-anonymized regions
  • Step 2: Uniformly select k atomic points from each anonymized region and reconstruct k trajectories

(Nergiz et al., TDP’09)
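The two steps can be sketched for a single group of k trajectories. The sketch assumes the trajectories are already aligned on the same timestamps and uses one bounding box per timestamp as the anonymized region; how trajectories are grouped and aligned is not covered. The names `generalize` and `reconstruct` are hypothetical.

```python
import random


def generalize(trajectories):
    """Step 1: one bounding box (xmin, ymin, xmax, ymax) per timestamp."""
    boxes = []
    for points in zip(*trajectories):
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


def reconstruct(boxes, k, rng):
    """Step 2: draw k points uniformly from each box and chain them into k trajectories."""
    return [
        [(rng.uniform(b[0], b[2]), rng.uniform(b[1], b[3])) for b in boxes]
        for _ in range(k)
    ]


# Hypothetical group of k = 2 trajectories with two samples each.
group = [[(0, 0), (2, 2)], [(1, 3), (4, 1)]]
regions = generalize(group)
released = reconstruct(regions, k=2, rng=random.Random(0))
print(regions)
print(len(released), len(released[0]))
```

The released trajectories are synthetic: they stay inside the anonymized regions but need not coincide with any original trajectory.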


Page 50: Privacy of Location Trajectory

Suppression-based Anonymization Approach

• Main Idea: Iteratively suppress locations until the privacy constraint is met
• Privacy constraint
• Difference between transformed trajectories and original ones

[Figure: example in which location a1 is suppressed]

(Terrovitis et al., MDM’08)

Page 51: Privacy of Location Trajectory

Suppression-based Anonymization Approach

The probability that the adversary can identify the actual user of any location pi

[Figure: example in which location a1 is suppressed]

Page 52: Privacy of Location Trajectory

Suppression-based Anonymization Approach

Calculate the difference between the transformed trajectory and the original


Page 55: Privacy of Location Trajectory

Grid-based Anonymization Approach

• Main Idea: Replace locations with grids (could have different resolutions)

(Gidofalvi et al., MDM’07)
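Replacing a location with its grid cell is straightforward to sketch. The example assumes a uniform square grid anchored at the origin; `cell_size` controls the resolution, and both function names are hypothetical.

```python
import math


def to_cell(point, cell_size):
    """Map an exact (x, y) location to the index of the grid cell containing it."""
    x, y = point
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def anonymize(trajectory, cell_size):
    """Replace every location of a trajectory with its grid cell."""
    return [to_cell(p, cell_size) for p in trajectory]


# Hypothetical trajectory published at two resolutions.
trajectory = [(1.2, 3.7), (5.0, 0.4), (9.9, 9.9)]
print(anonymize(trajectory, cell_size=5))  # coarse grid
print(anonymize(trajectory, cell_size=2))  # finer grid
```

A larger cell size hides more detail per location at the cost of less precise answers to spatio-temporal queries.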


Page 57: Privacy of Location Trajectory

Future Directions

• Personalized LBS (require more user semantics)
  • User preferences and background information could be used as quasi-identifiers

• Trajectory publication supporting more complex queries
  • Spatio-temporal queries
  • Spatio-temporal data analysis