deep learning for privacy and utility preserving sensor

50
Part 3: Deep Learning for Privacy and Utility Preserving Sensor Data Transformations Mohammad Malekzadeh PhD Student in CS at QMUL [email protected] 1 A tutorial on Deep Learning for Privacy in Multimedia ACM Multimedia 2020

Upload: others

Post on 23-Apr-2022

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Deep Learning for Privacy and Utility Preserving Sensor

Part 3:

Deep Learning for Privacy and Utility Preserving

Sensor Data Transformations

Mohammad MalekzadehPhD Student in CS at QMUL

[email protected]

1

A tutorial on Deep Learning for Privacy in Multimedia

ACM Multimedia 2020

Page 2: Deep Learning for Privacy and Utility Preserving Sensor

Outline

1. Motivations (Mobile & Wearable Sensors)

2. Problem Definition (User’s Privacy & Data Utility)

3. How to Protect Users’ Sensitive

I. Activities

II. Attributes

III. Activities & Attributes

4. Conclusion and Open Questions

5. Q & A (Sharing the Code Examples)

2

Page 3: Deep Learning for Privacy and Utility Preserving Sensor

1. Motivation

3

Page 4: Deep Learning for Privacy and Utility Preserving Sensor

Mobile and Wearables

• GPS Location

• Microphone

• Accelerometer

• Gyroscope

• Magnetometer

• Barometer

• Thermometer

• Proximity

• Ambient Light

• Heart Rate

• …

• BTS Location

• Microphone

1970 2020

4

Motion

Sensors

*Rodrigues, J. J., et. al. (2018). Enabling technologies for the internet of health things. IEEE Access, 6, 13129-13141.

Several types of wearable technology*

Page 5: Deep Learning for Privacy and Utility Preserving Sensor

Applications and Threats

• Applications:

– Health and Wellness,

– Patient and Elderly Monitoring,

– Gaming and VR, etc.

• Privacy Threats:

– Revealing sensitive activities:

– Leaking sensitive attributes

– The re-identification of the user

– Pin Code Inference, Targeted advertising, etc.

5

Page 6: Deep Learning for Privacy and Utility Preserving Sensor

Motion Sensors

x

-x

z

-z

y

-y

6

walking

Accelerometer

Gyroscope

Page 7: Deep Learning for Privacy and Utility Preserving Sensor

Activity Recognition

7

Data of a smartwatch worn on the right wrist of the user.

Page 8: Deep Learning for Privacy and Utility Preserving Sensor

2. Problem Definition

8

Page 9: Deep Learning for Privacy and Utility Preserving Sensor

Window-Based Classification

9

• 3 sensors

• 2 seconds’ time-window

• Stride length : ½ seconds

• Sampling rate: 50 Hz

Dimensions of X --> 100 x 9

Example:

Page 10: Deep Learning for Privacy and Utility Preserving Sensor

Perfect Privacy but Weak Utility

II. Edge-Based

I. Cloud-Based

Weak Privacy but Perfect Utility

III. Hybrid (edge and cloud)

Good Privacy and Good Utility

10

Three Approaches to Classification

Page 11: Deep Learning for Privacy and Utility Preserving Sensor

• Filtering

– To avoid releasing the original data if it includes sensitive information.

• Noise Addition

– independent or correlated noise to the original data.

• Transformation

– To generate a transformed version of the original data that:

• is still informative about the required task.

and

• is invariant to the user’s sensitive attribute.

Privacy-Preserving Mechanisms

11

Page 12: Deep Learning for Privacy and Utility Preserving Sensor

• Considering:

– 𝑿: the user’s data

– 𝓜: the desired transformation mechanism

• The aim is to release ሶ𝑿 = 𝓜(𝑿) such that:

– Required data, 𝐫, can be inferred from ሶ𝑿,

as accurate as possible to what one could have inferred if we would have released 𝑿.

– No information about the sensitive data, 𝐬, can be inferred from ሶ𝑿,

ideally, one cannot have a better guess than the random guess on the possible values

Utility and Privacy Preserving Data Transformation

12

Page 13: Deep Learning for Privacy and Utility Preserving Sensor

The Motivated Setting

13

required

sensitive

neutrals

For Training

Stage

Utility:

Privacy:

Metricsargmaxr P(r | ሶ𝑿) = True r ?

| P(s | ሶ𝑿) − Random Guess on s | ?

ሶ𝑿ሶ𝑿

ሶ𝑿

Page 14: Deep Learning for Privacy and Utility Preserving Sensor

3.I. How to Protect User’s Sensitive

Activities

14

Page 15: Deep Learning for Privacy and Utility Preserving Sensor

• Required: activities which the user

gains utility from sharing with the app.

• Sensitive: activities which the user

wish to keep private and should not be

revealed to the app.

• Neutral: activities that are not

sensitive to the user that these

activities can be recognized by the

server and it is also not useful in

gaining utility from the server.

Three types of activities

15

As an Example:

a step counter application

Page 16: Deep Learning for Privacy and Utility Preserving Sensor

Replacement Approach

16

required

Page 17: Deep Learning for Privacy and Utility Preserving Sensor

Pairing Datasets for Training

17

Page 18: Deep Learning for Privacy and Utility Preserving Sensor

Datasets

• Activity Recognition

18

Page 19: Deep Learning for Privacy and Utility Preserving Sensor

Evaluation Setting

19

• RAE : A 7-layers Deep Autoencoder

• Activity Recognizer: A Deep Convolutional Autoencoder

One of the state-of-the-art for activity recognition using sensor data*

sliding window ( ! )

se

lect

ed s

enso

rs (

" )

stepsize(# )

$%&→)%&

*→

)%&

+→

)%&

, -→)%&

+→)%&

*→ . / 0

Deep CNN

[state-of-the-art]

Cla

ssif

icat

ion

( 1

)

Third Party Model Replacement Autoencoder User’s Time-Series

d

*J. Yang, et. al., “Deep convolutional neural networks on multichannel time series for human activity recognition.” in IJCAI, 2015, pp. 3995–4001.

Page 20: Deep Learning for Privacy and Utility Preserving Sensor

Experimental Result

20

UTwente Dataset: Complex Human Activities Dataset*

* https://www.utwente.nl/en/eemcs/ps/research/dataset/

Classifier’s Accuracy

on original data 𝑿 → on transformed data ሶ𝑿

Page 21: Deep Learning for Privacy and Utility Preserving Sensor

Smoking Sub-Activities

21

Accelerometer Data

time

Page 22: Deep Learning for Privacy and Utility Preserving Sensor

Experimental Result

Classifier’s accuracy on the

• 𝑿 : original data

• ሶ𝑿 : transformed data

22

Skoda Dataset*

* http://har-dataset.org/doku.php?id=wiki:dataset

ሶ𝑿𝑿

Confusion Matrix

Page 23: Deep Learning for Privacy and Utility Preserving Sensor

A Potential Attack

• Using a Deep Convolutional Generative Adversarial Net. (DC-GAN)

23

: Original Data

: Transformed Data

Assuming that the adversary

have access to a dataset of the

target user

Page 24: Deep Learning for Privacy and Utility Preserving Sensor

Code

https://github.com/mmalekzadeh/replacement-autoencoder

24

Page 25: Deep Learning for Privacy and Utility Preserving Sensor

25

3.ii How to Protect User’s Sensitive

Attributes

Page 26: Deep Learning for Privacy and Utility Preserving Sensor

• Privacy is not only about sensitive activates.

• Information that might be discovered from non-sensitive activities:

– gender

– race

– weight

• Re-identification

– to figure out whether the observed data belongs to a specific person or not,

– for example, by taking advantage of some data collected through other channels.

Sensor Data Anonymization

26

Page 27: Deep Learning for Privacy and Utility Preserving Sensor

An experiment

27

The campus of Queen Mary University of London

• iPhone 6 in the front pocket of

participants’ trousers.

• 2 Sensors:

Device Acceleration

(Accelerometer)

Device Rotation (Gyroscope)

• 24 Subjects:

Gender, Age, Weight, Height

• 6 Activities:

Walking, Jogging, Downstairs,

Upstairs

Sat, Stand-up

Page 28: Deep Learning for Privacy and Utility Preserving Sensor

Correlation

28

walking

Page 29: Deep Learning for Privacy and Utility Preserving Sensor

User-Specific Info.

• 2D visualization Using t-SNE*

– Jogging Activity

– 2.5 seconds Time Window

– 24 Users

29

*Maaten, L. V. D., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of machine learning research, 9(Nov), 2579-2605.

Page 30: Deep Learning for Privacy and Utility Preserving Sensor

Single and Multivariate Data

30

Processing magnitude values for both

sensors using 2D convolutional filters

Re-I

den

tifica

tio

n A

ccu

racy (

%)

= x2 + y2 + z2magnitude

Page 31: Deep Learning for Privacy and Utility Preserving Sensor

The Effect of Reducing the Granularity

31

A window of 2.5 sec. sensor data is processed by a ConvNet* to recognize

users’ activity and to re-identify the user

*J. Yang, et. al., “Deep convolutional neural networks on multichannel time series for human activity recognition.” in IJCAI, 2015, pp. 3995–4001.

Page 32: Deep Learning for Privacy and Utility Preserving Sensor

Informational Privacy

32

𝐫 ∶ required data

𝐬 ∶ sensitive data

I ∶ mutual information

𝛿 ∶ allowed distortion

ሶ𝑿 ሶ𝑿𝑿𝓜:𝑿 → ሶ𝑿

Page 33: Deep Learning for Privacy and Utility Preserving Sensor

Anonymizing Approach

Dev

ice’s

Sen

sors

{ 𝐱1

𝐱𝟐

𝐱3

AAE

app

Required Info.

X Sensitive Info.

{ 𝐱4

𝐱5

𝐱6

𝑐𝑣 𝑥

{

𝐲1

𝐲𝟐

𝐲3

{

𝐲4

𝐲5

𝐲6

𝑐𝑣 𝑥

33

Page 34: Deep Learning for Privacy and Utility Preserving Sensor

Multi-Objective Loss Function

34

MSECategoricalCross Entropy

customized

ሶ𝑿

Page 35: Deep Learning for Privacy and Utility Preserving Sensor

Intuition

35

Predicted ො𝐬 1 - ො𝐬 𝓛𝒔

[.20 .20 .20 .20 .20] [.80 .80 .80 .80 .80] 1.12

[.12 .12 .12 .12 .50] [.88 .88 .88 .88 .50] 1.23

[.01 .09 .10 .30 .50] [.99 .91 .90 .70 .50] 1.26

[.01 .01 .09 .09 .80] [.99 .99 .91 .91 .20] 1.82

[.01 .01 .01 .01 .96] [.99 .99 .99 .99 .04] 3.04

e.g. True 𝐬 = 𝟓

Page 36: Deep Learning for Privacy and Utility Preserving Sensor

Datasets

36

Page 37: Deep Learning for Privacy and Utility Preserving Sensor

Experimental Results

Raw DataOur AAE

Transformation

Censoring

Representations[1]

Singular Spectrum

Analysis

Resampling

by FFT

Activity Recognition Accuracy (%) : Using ConvNets

92.5 ± 2.0 92.9 ± 0.3 ~ 91.5 ± 0.9 ~ 87.4 ± 0.9 88 ± 1.8

Re-Identification of Users Accuracy (%) : Using ConvNets

96.2 7.0 15.9 16.1 13.5

Data Similarity Rank : Using Dynamic Time Warping

0 6.6 10.7 9.5 9.3

37

Fidelity

Privacy

Utility

Motion Sense Dataset

Page 38: Deep Learning for Privacy and Utility Preserving Sensor

Visualization: t-SNE

38

Raw Transformed

Activities Users Activities Users

Page 39: Deep Learning for Privacy and Utility Preserving Sensor

Time Domain

39

Gyroscope

Accelerometer

Page 40: Deep Learning for Privacy and Utility Preserving Sensor

Spectrogram

40

Raw

Tra

nsfo

rmed

Gyroscope

New periodic components are introduced in the data and some of the original ones are obscured.

Accelerometer

Page 41: Deep Learning for Privacy and Utility Preserving Sensor

Code

https://github.com/mmalekzadeh/motion-sense

41

Page 42: Deep Learning for Privacy and Utility Preserving Sensor

42

3.iii How to Protect User’s Sensitive

Activities and Attributes

Page 43: Deep Learning for Privacy and Utility Preserving Sensor

A Compound Solution

43

… …

� � � �� � � � � � � �� � � � � � � ���� �

Raw Time-Window

… …

� � ′ � ′′

Sen

sors

Ap

p Buffering

a Time-Widow

Transformation by � � � � � � �� �� � � � �� � � � � � � �

Transformation by � � � �� � � � � � � � � �� � � � � � � �

Time Time

� � ′′

Transformed Time-Window

Page 44: Deep Learning for Privacy and Utility Preserving Sensor

Experimental Results

44

Motion Sense Dataset*

* https://github.com/mmalekzadeh/motion-sense

Page 45: Deep Learning for Privacy and Utility Preserving Sensor

Experimental Results

45

Mobi Act Dataset*

* Vavoulas, George, et al. "The MobiAct Dataset: Recognition of Activities of Daily Living using Smartphones." ICT4AgeingWell. 2016.

Page 46: Deep Learning for Privacy and Utility Preserving Sensor

Code

https://github.com/mmalekzadeh/motion-sense

46

Page 47: Deep Learning for Privacy and Utility Preserving Sensor

47

4. Conclusion and Open Questions

Page 48: Deep Learning for Privacy and Utility Preserving Sensor

Summary

48

• Data generated by motion sensors is informative

– user’s activities

– user’s attributes

• We can train deep autoencoders for data transformation

– release required data

– remove sensitive data

• The trained model can generalize

– model can be used by a for unseen user during training

– Model can be used for on-device data transformation or for offline dataset publishing

Page 49: Deep Learning for Privacy and Utility Preserving Sensor

Open Questions for Future Directions

1. Probabilistic and/or mathematical bound on the privacy and utility guarantees.

o Differential Privacy is not a suitable metric for continual sharing of multi-dimensional data.

o Information Privacy needs a complete knowledge of the data distribution.

2. Correlation among consecutive data release:

o An approach to account and track of the privacy loss occurred

o A Bayesian approach might be useful

3. Datasets including more fine-grained activities and more users.

4. The cost and complexity of such solutions for running them on the edge devices?

Utility Privacy

Cost

49

Page 50: Deep Learning for Privacy and Utility Preserving Sensor

Resources:

– https://github.com/mmalekzadeh/motion-sense

– https://github.com/mmalekzadeh/dana

– https://github.com/mmalekzadeh/replacement-autoencoder

– https://github.com/mmalekzadeh/privacy-preserving-bandits

Q & A

50

Thanks for your attention