Privacy-Enhancing Technologies & Applications to eHealth…
© 2015 IBM Corporation Anja Lehmann – IBM Research Zurich
IBM Research Zurich
IBM Research
– founded in 1945
– employees: 3,000
– 12 research labs on six continents
IBM Research Zurich
– founded in 1956
– more than 45 nationalities
– Nobel Laureates: 4
  • 1986 in Physics by Heinrich Rohrer & Gerd Binnig
  • 1987 in Physics by Alex Müller & Georg Bednorz
– research areas:
  • Science & Technology
  • Industry and Cloud Solutions
  • Cloud & Computing Infrastructure
  • Cognitive Computing & Computational Sciences
IBM Research Zurich
Cognitive Computing & Computational Sciences
– next generation cognitive systems and technologies
– big data, HPC, secure information management
– computational sciences
Security & Privacy group
– 13 people: researchers, PostDocs, PhD students, software engineers
– focus: privacy-enhancing technologies, provable security
This talk:
– why does privacy matter?
– what are the risks/limitations of current technologies?
– what measures exist to enhance privacy & security?
Why does privacy matter?
storage is becoming ever cheaper: store by default
– e.g., intelligence agencies, Google Street View with wireless traffic, Apple location history
once data is released, it can no longer be controlled
– can be copied & distributed
– different pieces can be linked and profiles be made
data mining is becoming more efficient
– not just trend detection, even prediction, e.g., flu pandemics
– correlation with illegal criteria, e.g., race, religion
networks and systems are badly protected
– feature creep, security comes last, if at all
– security breaches happen almost every day
… it is far too easy to collect & to lose data!
what are the risks if personal data is lost?
– embarrassment, blackmailing, identity theft, discrimination
security risk: data must be protected accordingly
basic protection techniques:
– reveal only data that is minimally necessary
– avoid globally unique personal identifiers
– strongly protect acquired (personal) data
avoid globally unique (personal) identifiers & linkability
Data Exchange
how to keep & exchange related data maintained by different entities ?
[Figure: entities that each maintain related data: Health Insurance, Hospital, Doctor A, Doctor B, Welfare Center, Pharma Company]
Data Exchange | Global Identifier
user data is associated with globally unique identifier
– e.g., insurance ID, social security number
[Tables: each entity stores its records as (ID, Data) under the global identifier]
Doctor A: Alice.1210, Bob.0411, Carol.2503
Hospital: Bob.0411, Carol.2503, Dave.1906
different entities can easily share & link related data records
[Figure: a request “Record of Bob.0411?” is exchanged between Hospital and Doctor A using the shared identifier]
+ simple data exchange
– no control over data exchange
– if records are lost, pieces can be linked together
– data has high value and requires strong protection
“random” yet global identifiers are not much better
– linkability allows re-identification
– similar problem: “anonymization” of data sets, e.g., Netflix challenge, credit card transactions
[Tables: the same records under random global identifiers]
Doctor A: #638801, #247495, #103928
Hospital: #247495, #103928, #774510
[Figure: the request now reads “Record of #247495?”]
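The linkability problem with global identifiers can be illustrated with a short sketch. Only the identifiers are taken from the slides; the record contents are placeholders invented for illustration, since the slides show no data values.

```python
# Placeholder record contents (hypothetical); identifiers as on the slides.
doctor_a = {
    "#638801": "doctor-record-1",
    "#247495": "doctor-record-2",
    "#103928": "doctor-record-3",
}
hospital = {
    "#247495": "hospital-record-1",
    "#103928": "hospital-record-2",
    "#774510": "hospital-record-3",
}


def link_records(*datasets):
    """Collect all records that share the same global identifier."""
    profiles = {}
    for dataset in datasets:
        for identifier, data in dataset.items():
            profiles.setdefault(identifier, []).append(data)
    return profiles


# Whoever obtains both data sets (e.g., after a breach) can merge them,
# even though the identifiers look random.
profiles = link_records(doctor_a, hospital)
linked = {ident: recs for ident, recs in profiles.items() if len(recs) > 1}
print(linked)
```

No key or cooperation is needed for the join: the shared identifier alone is enough to combine the pieces.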
Data Exchange | Pseudonyms & Trusted Central Authority
central authority derives independent entity-local identifiers from unique identifier
user data is associated with (unlinkable) entity-local identifiers aka “pseudonyms”
Central Authority:
ID          ID-A   ID-H
Alice.1210  Hba02  7twnG
Bob.0411    P89dy  ML3m5
Carol.2503  912uj  sD7Ab
Dave.1906   5G3wx  y2B4m
Doctor A (stores Data under ID-A): Hba02, P89dy, 912uj
Hospital (stores Data under ID-H): ML3m5, sD7Ab, y2B4m
only CA can link & convert pseudonyms → central hub for data exchange
[Figure: the request “Record of P89dy?” (Doctor A's pseudonym for Bob.0411) goes through the Central Authority, which converts it into “Record of ML3m5?” (the Hospital's pseudonym)]
+ control over data exchange
+ if records are lost, pieces cannot be linked together
+ user can monitor (& control) data flow
– central authority learns all requests & knows all correlations
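A minimal sketch of such a trusted central authority is given below. The slides do not say how the CA derives pseudonyms; keyed hashing with one HMAC key per entity is an assumption made here, and the class and method names are hypothetical.

```python
import hashlib
import hmac
import secrets


class CentralAuthority:
    """Trusted CA that derives and converts entity-local pseudonyms."""

    def __init__(self, entities):
        # Assumption: one secret HMAC key per entity.
        self._keys = {name: secrets.token_bytes(32) for name in entities}
        self._reverse = {}       # (entity, pseudonym) -> unique identifier
        self.request_log = []    # everything the CA observes

    def pseudonym(self, entity, unique_id):
        mac = hmac.new(self._keys[entity], unique_id.encode(), hashlib.sha256)
        nym = mac.hexdigest()[:10]
        self._reverse[(entity, nym)] = unique_id
        return nym

    def convert(self, from_entity, nym, to_entity):
        # The CA resolves the pseudonym itself, so it learns who asks about whom.
        unique_id = self._reverse[(from_entity, nym)]
        self.request_log.append((from_entity, to_entity, unique_id))
        return self.pseudonym(to_entity, unique_id)


ca = CentralAuthority(["Doctor A", "Hospital"])
nym_a = ca.pseudonym("Doctor A", "Bob.0411")
nym_h = ca.pseudonym("Hospital", "Bob.0411")
converted = ca.convert("Doctor A", nym_a, "Hospital")
print(converted == nym_h, ca.request_log)
```

The `request_log` makes the drawback named above concrete: every conversion exposes the underlying user to the CA.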
ideally: no party should know the correlation of all pseudonyms
Data Exchange | Pseudonyms & Central Authority
central authority & entities jointly derive pseudonyms from unique identifiers
– entities do not learn unique identifiers, CA does not learn the pseudonyms
user data is associated with pseudonyms
Central Authority (knows only ID): Alice.1210, Bob.0411, Carol.2503, Dave.1906
Doctor A (ID, Data): Hba02, P89dy, 912uj
Hospital (ID, Data): ML3m5, sD7Ab, y2B4m
only CA can link & convert identifiers → but does so in a blind way
[Figure: the request “Record of P89dy?” is sent to the Central Authority as a blind conversion request; the CA performs a blind conversion; after unblinding the conversion response, the request reads “Record of ML3m5?”]
+ control over data exchange
+ if records are lost, pieces cannot be linked together
+ central authority does not learn the request (cannot even tell if requests are for the same user)
+ central authority cannot link data itself
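One way a blind conversion could work is sketched below. The slides do not specify the mechanism; this sketch assumes pseudonyms of the form H(uid)^x with one CA exponent per entity, and uses the fact that ElGamal ciphertexts can be raised to a power without decrypting them. The group parameters are toy values, all function names are hypothetical, and the sketch omits joint pseudonym generation, re-randomization and authentication of requests. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import hashlib
import secrets

# Toy group: P = 2Q + 1 with Q prime; G generates the subgroup of order Q.
# These parameters are far too small for real use.
P, Q, G = 2039, 1019, 4


def hash_to_group(unique_id):
    """Map an identifier into the order-Q subgroup by squaring modulo P."""
    digest = hashlib.sha256(unique_id.encode()).digest()
    x = int.from_bytes(digest, "big") % (P - 1) + 1
    return pow(x, 2, P)


def rand_exponent():
    return secrets.randbelow(Q - 1) + 1  # uniform in [1, Q-1]


# CA secrets: one exponent per entity, so nym_E(uid) = H(uid)^x_E mod P.
x_a, x_h = rand_exponent(), rand_exponent()
# Hospital's ElGamal key pair, used to receive blinded conversions.
sk_h = rand_exponent()
pk_h = pow(G, sk_h, P)


def blind(nym, pk):
    """Doctor A: ElGamal-encrypt its pseudonym under the Hospital's key."""
    r = rand_exponent()
    return pow(G, r, P), nym * pow(pk, r, P) % P


def convert_blindly(ciphertext, x_from, x_to):
    """CA: raise both ciphertext parts to x_to / x_from without decrypting."""
    delta = x_to * pow(x_from, -1, Q) % Q
    c1, c2 = ciphertext
    return pow(c1, delta, P), pow(c2, delta, P)


def unblind(ciphertext, sk):
    """Hospital: ElGamal-decrypt to obtain its own pseudonym for the user."""
    c1, c2 = ciphertext
    return c2 * pow(pow(c1, sk, P), -1, P) % P


# Computed in one place here for demonstration only.
nym_a = pow(hash_to_group("Bob.0411"), x_a, P)  # held by Doctor A
nym_h = pow(hash_to_group("Bob.0411"), x_h, P)  # held by the Hospital

request = blind(nym_a, pk_h)
response = convert_blindly(request, x_a, x_h)
assert unblind(response, sk_h) == nym_h
```

The CA only ever handles a ciphertext under the Hospital's key, so in this sketch it sees neither the source pseudonym nor the converted one.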
Summary
if data contains personally identifying information
– high value for data thieves
→ privacy risk for users
→ security risk for data holder: data requires strong protection
“standard pseudonymization” doesn't help
– re-identification via linkability
basic protection techniques:
– reveal only data that is minimally necessary
– avoid globally unique personal identifiers
– strongly protect acquired (personal) data
privacy-enhancing yet strong authentication
anonymous/pseudonymous consultations, e.g.,
– online chat with a psychologist
– online consultation with IBM Watson
– pilot at Swedish school for anonymous consultation
“virtual trusted hardware”
how to secure confidential data on mobile devices
– challenge: security vs. convenience
– protection with user password (usually very insecure) & key server(s)
– jointly derive strong key
– split-key approach: losing device ≠ losing data
privacy-enhancing yet strong authentication
or how to reveal only the data that is minimally necessary
Strong Authentication | Motivation
Alice: I'd like to get some health consultation!
Online Health Service: Sure, if you have valid insurance.
Alice authenticates to the Online Health Service with a digital certificate / credential:
Name: Alice Doe
Date of Birth: Dec 12, 1998
Address: 7 Waterdrive
City: 8003 Zurich
Insurance: SWICA
Main ID: #1029347
Expiry Date: Jan 4, 2016
digital certificates for strong authentication
Online Health Service: Aha, you are Alice Doe, born on Dec 12, 1998, 7 Waterdrive, CH 8003 Zurich, SWICA insured, ID #1029347, expires Jan 4, 2016
This is a privacy and security problem!
– identity theft
– profiling
– discrimination
Privacy-Enhancing Credentials solve this.
When Alice authenticates to the Online Health Service, all the service learns is that Alice has valid insurance, and no more.
Privacy-Enhancing Credentials
privacy-enhancing credential for strong yet privacy-enhancing authentication
[Figure: Alice holds a privacy-enhancing credential with the attributes Name, Date of Birth, Address, City, Insurance, Main ID, Expiry Date]
privacy-enhancing credentials allow derivation of authentication tokens
– pseudonymous/anonymous authentication
– selective attribute disclosure
[Figure: from her credential Alice derives a token for the Online Health Service that shows only “valid subscription” (Expiry Date > today)]
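The idea of selective attribute disclosure can be approximated with a much simpler technique than the one behind privacy-enhancing credentials: salted hash commitments per attribute. This is a deliberately weaker stand-in, not the credential scheme from the talk. It cannot prove predicates such as “Expiry Date > today”, its tokens are linkable because the same commitments are shown every time, and an HMAC with a shared key stands in for the issuer's signature. All names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets

ISSUER_KEY = secrets.token_bytes(32)  # stand-in for the issuer's signing key


def commit(name, value, salt):
    return hashlib.sha256(f"{salt}|{name}|{value}".encode()).hexdigest()


def authenticate(commitments):
    payload = json.dumps(commitments).encode()
    return hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()


def issue(attributes):
    """Issuer: commit to every attribute and authenticate the commitments."""
    salts = {name: secrets.token_hex(16) for name in attributes}
    commitments = sorted(commit(n, v, salts[n]) for n, v in attributes.items())
    return {"salts": salts, "commitments": commitments,
            "tag": authenticate(commitments)}


def present(attributes, credential, disclosed_names):
    """User: reveal only the chosen attributes together with their salts."""
    disclosed = {n: (attributes[n], credential["salts"][n])
                 for n in disclosed_names}
    return {"commitments": credential["commitments"],
            "tag": credential["tag"], "disclosed": disclosed}


def verify(token):
    """Verifier: check the issuer tag, then every disclosed attribute."""
    if not hmac.compare_digest(authenticate(token["commitments"]), token["tag"]):
        return None
    for name, (value, salt) in token["disclosed"].items():
        if commit(name, value, salt) not in token["commitments"]:
            return None
    return {name: value for name, (value, _) in token["disclosed"].items()}


attributes = {"Name": "Alice Doe", "Insurance": "SWICA",
              "Expiry Date": "Jan 4, 2016"}
credential = issue(attributes)
token = present(attributes, credential, ["Insurance"])
print(verify(token))
```

The verifier sees the insurance attribute and a set of opaque hashes; the name and expiry date stay hidden behind their salts.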
Online Health Service: Thanks, you have valid insurance.
user can derive unlinkable tokens
– with different pseudonyms if unlinkability is desired
– or re-authenticate under an already established pseudonym
[Figure: two tokens derived from the same credential; the service cannot tell whether they come from the same user (“= ?”)]
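The choice between fresh and established pseudonyms can be sketched by deriving each pseudonym from a user secret and a scope string. The HMAC construction and the scope names are assumptions for illustration; a real credential system would additionally prove that the pseudonym belongs to a valid credential, which this sketch leaves out.

```python
import hashlib
import hmac
import secrets


def scoped_pseudonym(user_secret, scope):
    """Derive a stable pseudonym for one scope from the user's secret."""
    mac = hmac.new(user_secret, scope.encode(), hashlib.sha256)
    return mac.hexdigest()[:16]


user_secret = secrets.token_bytes(32)  # stays with the user

# Re-authentication: the same scope gives the same pseudonym again.
first_visit = scoped_pseudonym(user_secret, "online-health-service/chat")
second_visit = scoped_pseudonym(user_secret, "online-health-service/chat")

# Unlinkability: a different scope gives an unrelated pseudonym.
other_service = scoped_pseudonym(user_secret, "dna-database")

print(first_visit == second_visit, first_visit == other_service)
```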
Privacy-Enhancing Credentials | Use Cases
anonymous/pseudonymous consultations with specialists, e.g.,
– online chat with a psychologist
– online consultation with IBM Watson
– not just theory: pilot at Swedish school for anonymous consultation
anonymous access to high-value databases, e.g., DNA databases
– who accesses which data at which time can reveal sensitive information about the users (their research strategy, habits, etc.)
Privacy-Enhancing Credentials & Oblivious Transfer
oblivious data transfer / private information retrieval:
– user can access database and gets only the data he has authorization for
– database does not learn who the user is (but is assured he has access rights) or what data the user is fetching
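To show that fetching a record without revealing which one is possible at all, here is a sketch of the classic two-server private information retrieval idea. It is not the protocol from the talk: it assumes the database is replicated on two servers that do not collude, and it covers neither access control nor anonymity of the user. The database contents are placeholders.

```python
import secrets
from functools import reduce

# Placeholder database of equal-length records, replicated on two servers.
DATABASE = [b"record-0", b"record-1", b"record-2", b"record-3"]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def server_answer(database, selection):
    """Server: XOR together all records whose selection bit is set."""
    zero = bytes(len(database[0]))
    chosen = (rec for rec, bit in zip(database, selection) if bit)
    return reduce(xor_bytes, chosen, zero)


def make_queries(n, index):
    """Client: two selection vectors that differ only at the wanted index."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2


def fetch(index):
    q1, q2 = make_queries(len(DATABASE), index)
    # Each query on its own is a uniformly random bit vector.
    a1 = server_answer(DATABASE, q1)  # sent to server 1
    a2 = server_answer(DATABASE, q2)  # sent to server 2
    # All records selected by both queries cancel out under XOR.
    return xor_bytes(a1, a2)


print(fetch(2))
```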
Motivation
How to store confidential data without assuming trusted user storage?
challenge: mobile devices can get lost/stolen
How to protect sensitive data on a mobile device?
solution?
– device encrypts data under password-derived key
[Figure: the user password pwd (e.g., “psswrd123”) yields the key that turns the sensitive data (plaintext) into ciphertext]
– device only stores encrypted data, but not the encryption key
– to decrypt data, reconstruct key from password
[Figure: the user re-enters the password pwd' (e.g., “psswrd123”); the reconstructed key turns the ciphertext back into plaintext]
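Deriving and reconstructing the key can be sketched with a standard password-based key derivation function. The slides name no specific function; PBKDF2-HMAC-SHA256, the salt and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import secrets


def derive_key(password, salt, iterations=200_000):
    """Stretch a password into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)


# The device stores only the salt (and the ciphertext), never the key.
salt = secrets.token_bytes(16)
encryption_key = derive_key("psswrd123", salt)

# Later: the user re-enters the password and the key is reconstructed.
reconstructed = derive_key("psswrd123", salt)
wrong = derive_key("password123", salt)

print(reconstructed == encryption_key, wrong == encryption_key)
```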
How to protect sensitive data on a mobile device?
what happens if device gets lost/stolen?
– adversary only learns the encrypted data
– but he can try to reconstruct key by guessing the password
  • problem: offline attacks (dictionary attack, brute-force)
  • 16-char passwords ~ 1 billion possibilities (≈ 2^30) vs. GPUs that test billions of guesses per second
→ to get reasonable security, “passwords” must be long, random values
→ inconvenient to use & hard to memorize
[Figure: the adversary tries guesses such as “password”, “password123”, “psswrd123”, … against the ciphertext until it decrypts to the plaintext]
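The offline attack can be sketched as follows. The key derivation (PBKDF2) and the key-check value that stands in for “this key decrypts the ciphertext correctly” are assumptions for illustration; the candidate list uses the guesses shown on the slide.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 10_000  # kept low so the demonstration runs quickly


def derive_key(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


def key_check(key):
    """Stand-in for testing whether a key decrypts the ciphertext."""
    return hmac.new(key, b"key-check", hashlib.sha256).digest()


# What the adversary finds on the stolen device.
salt = secrets.token_bytes(16)
stored_check = key_check(derive_key("psswrd123", salt))


def offline_attack(candidates):
    """Try candidate passwords locally; nobody can limit the attempts."""
    for guess in candidates:
        candidate_check = key_check(derive_key(guess, salt))
        if hmac.compare_digest(candidate_check, stored_check):
            return guess
    return None


found = offline_attack(["password", "password123", "psswrd123"])
print(found)
```

Everything the attack needs is on the device, so the only cost per guess is computation.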
challenge:
– how can we get a strong cryptographic key from a weak password … without having offline attacks if device is lost?
solution:
– involve a key server & two-factor authentication towards server
[Figure: three parties: the user with password pwd, the device with device secret S, and a server]
solution:
– two-factor authentication based on user password and device secret
– server chooses & stores random encryption key (i.e., independent of pwd or h!)
– device does not store encryption key or any password-derived data
– online password verification to retrieve encryption key
[Figure, setup: the device computes h = Hash(S, pwd) from the device secret S and the user password pwd and sends h to the server; the server stores h together with the encryption key]
[Figure, retrieval: the user enters pwd'; the device computes h' = Hash(S, pwd') and sends it to the server; the server checks h' = h ?]
what happens if device gets lost/stolen?
– adversary must retrieve decryption key from server
– server will recognize suspicious behaviour and block account
→ offline attacks don't work anymore
[Figure: the adversary holds the device secret S and submits guesses h* = Hash(S, pwd*) for “password”, “password123”, “psswrd123”, …; the server checks h* = h ?, answers “wrong password”, and blocks verification after too many failed attempts]
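The key-server approach can be sketched as follows. Hash(S, pwd) is instantiated with HMAC-SHA256, the attempt limit is set to 3, and the class, function and account names are hypothetical; the slides fix none of these details. The fourth guess “12345678” is also invented for the demonstration.

```python
import hashlib
import hmac
import secrets

MAX_FAILED_ATTEMPTS = 3  # hypothetical limit; the slides give no number


def device_hash(device_secret, password):
    """h = Hash(S, pwd), instantiated with HMAC-SHA256 (an assumption)."""
    return hmac.new(device_secret, password.encode(), hashlib.sha256).digest()


class KeyServer:
    def __init__(self):
        self._accounts = {}

    def register(self, account, h):
        # The key is random, i.e., independent of pwd and h.
        key = secrets.token_bytes(32)
        self._accounts[account] = {"h": h, "key": key, "failed": 0}
        return key

    def retrieve(self, account, h_prime):
        entry = self._accounts[account]
        if entry["failed"] >= MAX_FAILED_ATTEMPTS:
            raise PermissionError("account blocked")
        if hmac.compare_digest(entry["h"], h_prime):
            entry["failed"] = 0
            return entry["key"]
        entry["failed"] += 1
        raise ValueError("wrong password")


def try_retrieve(server, account, device_secret, password):
    try:
        return server.retrieve(account, device_hash(device_secret, password))
    except (ValueError, PermissionError) as error:
        return str(error)


server = KeyServer()
S = secrets.token_bytes(32)  # device secret
key = server.register("alice", device_hash(S, "psswrd123"))

# Legitimate user: the correct password returns the key.
assert try_retrieve(server, "alice", S, "psswrd123") == key

# Thief with the device: every guess goes through the server.
guesses = ["password", "password123", "12345678", "psswrd123"]
results = [try_retrieve(server, "alice", S, guess) for guess in guesses]
print(results)
```

Because the account is blocked after three failures, even the correct fourth guess no longer yields the key; a real deployment would also need a recovery path for the legitimate user.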