tagvisor: a privacy advisor for sharing hashtags · • pick the hashtag set with the highest...

20
Tagvisor: A Privacy Advisor for Sharing Hashtags Yang Zhang Joint work with Mathias Humbert, Tahleen Rahman, Cheng-Te Li, Jun Pang and Michael Backes

Upload: others

Post on 29-Jul-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

Tagvisor: A Privacy Advisor for Sharing Hashtags

Yang Zhang Joint work with Mathias Humbert, Tahleen Rahman, Cheng-Te Li, Jun Pang and Michael Backes

Page 2: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#hashtag

!2

Page 3: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#hashtag

!3

Page 4: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#hashtag

!4

Page 5: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#hashtag

!5

Page 6: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#hashtag

!6

#like4like

#foodporn

#tbt

Page 7: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#hashtag

!7

#privacy

#locationprivacy

Page 8: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#contributions

• Attack: location inference with hashtags

• Defense: Tagvisor, a privacy advisor to mitigate the privacy threat by hashtags

!8

Page 9: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#dataset

• Collected through Instagram’s APIs

• New York, Los Angeles, and London

• Hashtags + locations (check-ins)

!9

Page 10: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#attack

!10

[1, 1, 1, 0]

[0, 1, 1, 0]

[1, 0, 0, 1]

• Bag-of-words for feature representation

• Random forest classifier

• Multiple-class classification, e.g., 498 classes (locations) in New York

• All posts are trained together

#a#b#c

#b#c

#a#d

Page 11: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#attack

!11

Page 12: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#attack

!12

Page 13: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#tagvisor

• A privacy advisor for sharing hashtags

• Fool the attacker’s location inferencer (ML classifier)

• Three defense mechanisms

• Hiding

• Replacement

• Generalization (location category)

• Utility: preserving the semantical meaning of hashtags

!13

Page 14: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#hiding

!14

hide #a

hide #b

hide #c

successful attack

delete one hashtag (can be more)

#a#b#c

#b#c

#a#c

#a#b

Page 15: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#utility

!15

• Semantical meaning

• Skip-gram, aka word2vec

• Skip-gram over all posts’ hashtags

#a: [3.1, 1.3]#b: [2.5, 1.9]#c: [4.0, 5.1] #a

#b

#c

#a#b#c#a#c

#a#b

Hashtag vectorsd1 d2

d1

d2

#a#b#c#a#c

#a#b

Page 16: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#replacement

!16

• Replace each hashtag with all the possible hashtag

• Search space is too big

• Bound to the most closest hashtags (with word2vec)

• Reduce the search space

• Semantical meaning can be preserved

successful attack #a#b#c

Page 17: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#generalization

• Location category from foursquare

• #centralpark -> #park

• Do not apply to all hashtags

• e.g., #tbt #love

!17

Page 18: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#tagvisor

• Check whether the post’s location is inferred correctly

• If no, then publish

• Else, consider the three defense mechanisms

• Pick the hashtag set with the highest utility

!18

Page 19: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#tagvisor

!19

Obfuscating 2 hashtags is enough!

Obfuscating bounded number of hashtags

Page 20: Tagvisor: A Privacy Advisor for Sharing Hashtags · • Pick the hashtag set with the highest utility!18. #tagvisor!19 Obfuscating 2 hashtags is enough! Obfuscating bounded number

#conclusion

• First location inference attack with hashtags

• Sharing hashtags is not safe!!!

• A privacy advisor to mitigate this risk

• Minimal risk and maximal utility

• Fit for the real-world setting

!20

#thankyou

https://yangzhangalmo.github.io/@yangzhangalmo