acquisti faces blackhat draft
Post on 06-Apr-2018
-
8/2/2019 Acquisti Faces BLACKHAT Draft
1/48
Alessandro Acquisti, Ralph Gross, Fred Stutzman
Heinz College & CyLab, Carnegie Mellon University
PLEASE NOTE: DRAFT VERSION. Final version to be presented at Black Hat USA on August 4, 2011
Black Hat 2011
Faces of Facebook: Privacy in the Age of Augmented Reality
-
Computer face recognition has been around for a long time (e.g., Bledsoe, 1964; Kanade, 1973)
Computers still perform much worse than humans when recognizing faces
However, automatic face recognition has kept improving, and has started being used in actual applications
Especially in security, and more recently Web 2.0
-
Face recognition in Web 2.0
Google has acquired Neven Vision, Riya, and PittPatt, and deployed face recognition into Picasa
Apple has acquired Polar Rose, and deployed face recognition into iPhoto
Facebook has licensed Face.com to enable automated tagging
So, what is different about this research?
-
Increasing public self-disclosures through online social networks; especially, photos
In 2010, 2.5 billion photos uploaded by Facebook users alone per month
Identified profiles in online social networks
Individuals using their real first and last names on Facebook, LinkedIn, Google+, etc.
Continuing improvements in face recognition accuracy
In 1997, the best face recognizer in the FERET program achieved a false reject rate of 0.54 (at a false accept rate of 0.001)
By 2006, the false reject rate was down to 0.01
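A note on reading these accuracy figures: a false reject rate is quoted at a fixed false accept rate, meaning the match threshold is first set so that only a given share of impostor comparisons is accepted, and the share of genuine comparisons rejected at that threshold is then measured. The following is a minimal Python sketch of that calculation. The function name `frr_at_far` and the toy similarity scores are hypothetical illustrations, not data or code from the FERET evaluation.

```python
def frr_at_far(genuine, impostor, target_far):
    """Return (threshold, false_reject_rate) at the lowest observed
    threshold whose false accept rate does not exceed target_far.

    genuine:  similarity scores for same-person pairs (toy values here)
    impostor: similarity scores for different-person pairs (toy values here)
    A pair is accepted as a match when its score >= threshold.
    """
    candidates = sorted(set(genuine) | set(impostor))
    for threshold in candidates:
        far = sum(s >= threshold for s in impostor) / len(impostor)
        if far <= target_far:
            frr = sum(s < threshold for s in genuine) / len(genuine)
            return threshold, frr
    # No observed score works as a threshold: reject every pair.
    return float("inf"), 1.0


# Hypothetical scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5]
print(frr_at_far(genuine_scores, impostor_scores, target_far=0.0))
```

A stricter false accept target pushes the threshold up, which is why the false reject rate is only meaningful together with the false accept rate at which it was measured.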
-
Statistical re-identification: data mining allows surprising, sensitive inferences from public data
US citizens identifiable from zip, DOB, gender (Sweeney, 1997); Netflix Prize de-anonymization (Narayanan and Shmatikov, 2006); SSN predictions from Facebook profiles (Acquisti and Gross, 2009)
Cloud computing
Makes it feasible and economic to run millions of face comparisons in seconds
Ubiquitous computing
Combined with cloud computing, makes it possible to run face recognition through mobile devices, e.g., smartphones
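The statistical re-identification results cited on this slide rest on a simple observation: a combination of individually innocuous attributes can single out one record. A minimal Python sketch of that idea follows. The records, the field names `zip`, `dob`, `gender`, and the function `unique_fraction` are all made up for illustration; this is not Sweeney's data or method.

```python
from collections import Counter


def unique_fraction(records, keys=("zip", "dob", "gender")):
    """Fraction of records whose combination of `keys` is shared with
    no other record, i.e. records that the combination singles out."""
    def combo(record):
        return tuple(record[k] for k in keys)

    counts = Counter(combo(r) for r in records)
    unique = sum(1 for r in records if counts[combo(r)] == 1)
    return unique / len(records)


# Hypothetical toy records, for illustration only.
toy_records = [
    {"zip": "15213", "dob": "1980-01-01", "gender": "F"},
    {"zip": "15213", "dob": "1980-01-01", "gender": "F"},
    {"zip": "15213", "dob": "1975-06-30", "gender": "M"},
    {"zip": "15217", "dob": "1980-01-01", "gender": "F"},
]
print(unique_fraction(toy_records))
```

In this toy set, two of the four records share the same combination, so only the other two are uniquely identifiable from the three attributes.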
-
The convergence of these technologies is democratizing surveillance
Not just Web 2.0 face recognition apps limited and constrained to consenting/opt-in users, but...
... a world where anyone may run face recognition on anyone else, online and offline
-
Your face is the veritable link between your offline identity and your online identit(ies)
Data about your face and your name is, most likely, already publicly available online
Hence, face recognition creates the potential for your face in the street (or online) to be linked to your online identit(ies), as well as to the sensitive inferences that can be made about you after blending together offline and online data
-
This seamless merging of online and offline data raises the issue of what privacy will mean in such an augmented reality world
Through social networks, have we created a de facto, unregulated Real ID infrastructure?
-
Our research investigates the feasibility of combining publicly available online social network data with off-the-shelf face recognition technology for the purpose of large-scale, automated, peer-based
1. individual re-identification, online and offline
2. accretion and linkage of online, potentially sensitive, data to someone's face in the offline world
-
Democratization of surveillance
Faces as conduits between online and offline data
The emergence of PPI: personally predictable information
The rise of visual, facial searches
The future of privacy in a world of augmented reality
-
Experiment 1: Online-to-online Re-Identification
Experiment 2: Online-to-offline Re-Identification
Experiment 3: Online-to-offline Sensitive Inferences
-
Un-identified DB: personal profiles on Match.com, Prosper.com, etc.; photo repositories (e.g., Flickr); open webcams; CCTVs; your face on the street; [...]
Identified DB: personal profiles on Facebook.com, LinkedIn, etc.; govt. or corporate databases; [...]
Additional sensitive inferences (e.g., sexual orientation, SSN, etc.)
Face recognition [1] makes it possible to match a subject in an un-identified DB against data in an identified DB [2]
Once that is done, sensitive data inferred from the un-identified DB [3] can be linked to the identity of the subject in the identified DB [4]
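The linkage step on this slide can be sketched as a nearest-neighbor search: each face from the un-identified DB is compared with every face in the identified DB, and the best-scoring identity is kept only if it clears a similarity threshold. The Python sketch below is a toy illustration under strong assumptions: faces are stood in for by short non-zero feature vectors, similarity is plain cosine similarity, and the names `cosine`, `link_records`, the 0.9 threshold, and all data are hypothetical. It does not reproduce the PittPatt recognizer or its API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def link_records(unidentified, identified, threshold=0.9):
    """Map each pseudonym to the best-matching name in `identified`,
    keeping only matches whose similarity is at least `threshold`."""
    links = {}
    for pseudonym, probe in unidentified.items():
        best_name, best_score = None, -1.0
        for name, vector in identified.items():
            score = cosine(probe, vector)
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None and best_score >= threshold:
            links[pseudonym] = best_name
    return links


# Hypothetical feature vectors, for illustration only.
unidentified_db = {"user42": [1.0, 0.0], "user77": [0.0, 1.0]}
identified_db = {"Alice": [0.99, 0.1], "Bob": [0.6, 0.8]}
print(link_records(unidentified_db, identified_db))
```

Only the pseudonym whose best match clears the threshold is linked; anything inferred from that pseudonymous record could then be attached to the matched name, which is the step labeled [3] to [4] above.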
-
Online to online
We mined publicly available images from online social network profiles to re-identify profiles on one of the most popular dating sites in the US
We used the PittPatt face recognizer (Nechyba, Brandy, and Schneiderman, 2007) for:
Face detection: automatically locating human faces in digital images
Face recognition: measuring similarity between any pair of faces to determine if they are of the same person
-
Facebook profiles
We downloaded primary profile photos for Facebook profiles from a North American city using a search engine's API (i.e., without even logging on to Facebook itself)
Noisy profile search pattern: combination of search strategies (current location, member of local networks, fan of local companies/teams, etc.)
-
Dating site profiles
Profiles belonged to members of one of the most popular dating sites in the US
Members use pseudonyms to protect their identities
However, facial images may make members recognizable not just by friends, but by strangers
Unfeasible if done manually (hundreds of millions of potential matches to verify), but quite feasible using face recognition + cloud computing
-
Overlap between our dating site data and Facebook data is inherently noisy (geographical search vs. keywords search)
We ran two surveys to estimate Facebook/dating site members overlap
Then, multiple human coders graded matched pairs to evaluate the face recognizer's accuracy
-
One out of 10 of the dating site's pseudonymous members was identified
Note: In Experiment 1, we constrained ourselves to using only a single Facebook (primary profile) photo, and only considering the top match returned by the recognizer
However: Because an attacker can use more photos, and test more matches, the ratio of re-identifiable individuals will dramatically increase
See, in fact, Experiment 2
Also: as face recognizers' accuracy increases, so does the ratio of re-identifiable individuals
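The effect of testing more than the top match can be sketched as a hit rate at rank k: a probe counts as re-identified if the true identity appears anywhere in the recognizer's first k candidates. The Python sketch below assumes the recognizer returns a ranked candidate list per probe. The function name `hit_rate_at_k` and the toy rankings are hypothetical, and the numbers say nothing about the rates observed in the experiments.

```python
def hit_rate_at_k(ranked_candidates, true_identity, k):
    """Share of probes whose true identity is among the top-k candidates.

    ranked_candidates: probe id -> candidate identities, best match first
    true_identity:     probe id -> correct identity
    """
    hits = sum(
        1
        for probe, ranking in ranked_candidates.items()
        if true_identity[probe] in ranking[:k]
    )
    return hits / len(ranked_candidates)


# Hypothetical rankings, for illustration only.
rankings = {
    "probe1": ["A", "B", "C"],
    "probe2": ["B", "A", "C"],
    "probe3": ["C", "B", "A"],
}
truth = {"probe1": "A", "probe2": "A", "probe3": "A"}
for k in (1, 2, 3):
    print(k, hit_rate_at_k(rankings, truth, k))
```

The hit rate can only stay equal or rise as k grows, which is the sense in which an attacker willing to check more candidates re-identifies more people than the top-match-only setup of Experiment 1.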
-
Offline to online
We used publicly available images from a Facebook College network to identify students strolling on campus
-
College photos
We used a webcam to take 3 photos per participant
Photos gathered over two days in November
-
We asked students walking by to stop and have their picture taken
Then, we asked participants to answer an online survey about Facebook usage
In the meanwhile, face matching was taking place on a cloud computing service
The last page of the survey was populated dynamically with the best matching pictures found by the recognizer
Participants were asked to select photos in which they recognized themselves
-
Roughly one out of three subjects was identified
Average computation time per subject: less than three seconds
-
In Experiment 2 we found the Facebook profiles containing images that matched the facial features of students working on campus
But: in 2009, we used Facebook profile information to predict individuals' Social Security numbers
Acquisti and Gross, Predicting Social Security Numbers from Public Data, Proceedings of the National Academy of Sciences, 2009
-
+ = SSN
I.e., predicting SSNs from faces
-
Experiment 3 was about predicting personal and sensitive information from a face
We trained an algorithm to automatically identify the most likely Facebook profile owner given a match between the Experiment 2 subjects' photos and a database of Facebook images
From the predicted profiles, we inferred names, DOBs, other demographic information, as well as interests/activities of the subjects
With that information, we predicted the participants' SSNs
We then asked participants in Experiment 2 whom we had thusly identified to participate in a follow-up study
-
In the follow-up study, we asked participants to verify our predictions about their:
Interests/Activities (from Facebook profiles)
SSNs' first five digits (predicted using Acquisti and Gross, 2009's algorithm)
Note: last 4 digits are predictable too (see Acquisti and Gross, 2009). Prediction accuracy varies greatly, as a function of state and year of birth, and can be correctly estimated only with larger sample sizes than what was available in Experiment 3
-
Source: http://www.director-thailand.com/blog/what-is-augmented-reality
-
Our demo smartphone app combines and extends the previous experiments to allow:
Personal and sensitive inferences
From someone's face
In real time
On a mobile device
Overlaying information (obtained online) over the image of the individual (obtained offline) on the mobile device's screen
-
Sources of online data can be Facebook (to identify someone's name), Spokeo (once someone's name has been identified)... and then, the sensitive inferences one can make based on that data (e.g., SSNs, but also sexual orientation, credit scores, etc.)
That is: the emergence of personally predictable information from a person's face
-
Availability of facial images
Legal and technical implications of mining identified images from online sources
Cooperative subjects
Face recognizers perform worse in the absence of clean frontal photos
On the street, clean and frontal photos of uncooperative strangers are unlikely
Geographical restrictions
Experiment 1 focused on a city area (~330k individuals). Experiment 2 focused on a college community (~25k individuals)
As the set of potential targets gets larger (e.g., nationwide), computations needed for face recognition get less accurate (i.e., more false positives), and take more time
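Why a larger set of targets produces more false positives can be illustrated with back-of-the-envelope arithmetic. The Python sketch below assumes, purely for illustration, that every comparison against a non-matching face is an independent trial with the same false accept rate. The rate 0.001 is simply the FERET operating point quoted earlier in this deck, not a measured rate for the recognizer used in the experiments; the set sizes are the approximate ones given on this slide, and both function names are hypothetical.

```python
def expected_false_matches(far, gallery_size):
    """Expected number of wrong faces accepted when one probe is compared
    against `gallery_size` non-matching faces (independence assumed)."""
    return far * gallery_size


def prob_any_false_match(far, gallery_size):
    """Probability that at least one non-matching face is accepted."""
    return 1.0 - (1.0 - far) ** gallery_size


# Set sizes from this slide; false accept rate is an illustrative assumption.
for size in (25_000, 330_000):
    print(size, expected_false_matches(0.001, size), prob_any_false_match(0.001, size))
```

Under these assumptions the expected number of false matches grows linearly with the size of the target set, so a threshold that is workable for a campus yields far more spurious candidates at city or national scale unless the recognizer's false accept rate falls accordingly.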
-
Face recognition of everyone/everywhere/all the time is not yet feasible
However: Current technological trends suggest that most current limitations will keep fading over time
-
There exist legal and technical constraints to mining identified images from online sources
However:
Many sources are publicly available (e.g., do not require login, such as LinkedIn profile photos; or can be searched through search engines, such as Facebook primary profile photos: see Experiment 1)
Face recognition companies are already collaborating with social network sites to tag billions of images (e.g., see Face.com's recent announcement)
Tagging self, and others, in photos has become socially acceptable; in fact, widespread (thus providing a growing source of identified images)
-
As search engines enter the face recognition space, facial visual searches may become as common as today's text-based searches
Text-based searches of someone's name across the WWW, which are common now, were unimaginable 15 years ago (before search engines)
From spidered & indexed HTML pages, to spidered & indexed photos
Google has already announced searches based on image (although not facial image) pattern matching
The number of Silicon Valley players entering this space in recent months demonstrates the commercial interest in face recognition
-
What we did on the street with mobile devices today (requiring point-and-shoot and cooperative subjects) will be accomplished in less intrusive ways tomorrow
Glasses (already happening: Brazilian police preparing for 2014 World Cup)
How long before it can be done on... contact lenses?
Face recognizers will keep getting better at matching faces based on non-frontal images (compare PittPatt version 5.2 vs. version 4.2)
-
As the set of potential targets gets larger (e.g., nationwide DB of individuals), the computations needed for face recognition get less accurate (more false positives) and take more time
However: databases of identified images are getting larger, with more individuals in them (see previous slides)
Accuracy (number of false positives, number of false negatives) of face recognizers steadily increases over time, especially so in the last few years
Cloud computing clusters will keep getting faster, larger (more memory available == larger target DBs feasible to analyze), and cheaper, making massive face comparisons economical
-
Web 2.0 profiles (e.g., Facebook) are becoming de facto unregulated Real IDs
See the FTC's recent approval of Social Intelligence Corporation's social media background checks
Great potential for commerce and e-commerce
Imagine Minority Report-style advertising; however, happening much earlier than 2054
-
Opt-in is ineffective as protection, since most data is already publicly available
E.g., Facebook sets primary profile photos to be visible to all by default, and requires members to sign up to the network with their real identities
-
What will privacy mean in a world where a stranger on the street could guess your name, interests, SSNs, or credit scores?
The coming age of augmented reality, in which online and offline data are blended in real time, may force us to reconsider our notions of privacy
-
In fact, augmented reality may also carry deep-reaching behavioral implications
Through natural evolution, human beings have evolved mechanisms to assign and manage trust in face-to-face interactions
Will we rely on our instincts, or on our devices, when mobile devices make their own predictions about hidden traits of a person we are looking at?
-
Democratization of surveillance
Faces as conduits between online and offline data
The emergence of PPI: personally predictable information
The rise of visual, facial searches
The future of privacy in a world of augmented reality
-
We gratefully acknowledge research support from
National Science Foundation under Grant 0713361
U.S. Army Research Office under Contract DAAD190210389
Heinz College
Carnegie Mellon CyLab
Carnegie Mellon Berkman Fund
-
Main RAs: Ganesh Raj ManickaRaju, Markus Huber, Nithin Betegeri, Nithin Reddy, Varun Gandhi, Aaron Jaech, Venkata Tumuluri
Additional RAs: Aravind Bharadwaj, Laura Brandimarte, Samita Dhanasobhon, Hazel Diana Mary, Nitin Grewal, Anuj Gupta, Snigdha Nayak, Rahul Pandey, Soumya Srivastava, Thejas Varier, Narayana Venkatesh
-
Google: economics privacy
Visit: http://www.heinz.cmu.edu/~acquisti/economics-privacy.htm
Email: acquisti@andrew.cmu.edu